Limit order books 



Martin D. Gould,* Mason A. Porter, Stacy Williams, Mark McDonald,
Daniel J. Fenn, and Sam D. Howison

Oxford Centre for Industrial and Applied Mathematics,
Mathematical Institute,
University of Oxford,
Oxford OX1 3LB,
UK

Oxford-Man Institute of Quantitative Finance,
University of Oxford,
Oxford, OX2 6ED,
UK

CABDyN Complexity Centre,
University of Oxford,
Oxford OX1 1HP,
UK

FX Quantitative Research,
HSBC Bank, London E14 5HQ,
UK

Limit order books are used to match buyers and sellers in more than half of the world's 
financial markets, and have been studied extensively in several disciplines during the 
past decade. This survey highlights the many insights from the wealth of empirical
and theoretical studies that have been conducted, and the numerous unsolved problems 
that remain. We illustrate the differences between observations from empirical studies 
of limit order books and the models that attempt to replicate them. In particular, 
many modelling assumptions are poorly supported by data and several well-established 
empirical facts have yet to be reproduced satisfactorily by models. By examining existing 
models of limit order books, we identify some key unresolved questions and difficulties
currently facing researchers of limit order trading. 



* Corresponding author: gouldm@maths.ox.ac.uk



CONTENTS

I. Introduction

II. A Mathematical Description of Limit Order Trading
   A. Preliminaries
   B. Price changes in limit order books
   C. Orders: the building blocks of a limit order book
   D. The rise in popularity of limit order books

III. Limit Order Book Challenges
   A. Different perspectives
   B. State-space complexity
   C. Feedback
   D. Coupling between b(t) and a(t)
   E. Quantifying patience
   F. Priority
   G. Incomplete sampling and hidden liquidity
   H. Volatility
      1. Model-based estimates of volatility
      2. Model-free estimates of volatility
      3. Time window effects
   I. Resolution parameters
   J. Bilateral trade agreements
   K. Opening and closing auctions
   L. Power laws
   M. Long-Range Correlations
      1. Long- and short-memory processes
      2. Practical considerations
      3. The Hurst exponent
      4. Estimators of H

IV. Empirical Observations in Limit Order Markets
   A. Order size
   B. Relative price
   C. Limit order cancellations
   D. Mean relative depth profile
   E. Volatility
   F. Conditional event frequencies
      1. Order size
      2. Relative price
      3. Arrival rates
      4. Cancellations
      5. Price movements
      6. Incoming order prices
      7. Order flow
      8. Limit order book state
   G. Market impact and price impact
      1. Instantaneous price impact
      2. Permanent price impact
      3. Market impact
   H. Stylized facts
      1. Heavy-tailed return distribution
      2. Autocorrelation of returns
      3. Long memory

V. Limit Order Book Models
   A. Perfect-rationality approaches
      1. Cut-off strategies
      2. Fundamental values and informed traders
      3. Minimizing market impact
   B. Zero-intelligence approaches
      1. Model framework
      2. Random-walk diffusion models
      3. Discrete-time models
      4. Continuous-time models
      5. Beyond zero intelligence
   C. Agent-based models

VI. Discussion

Acknowledgments

References

A. Table of Empirical Studies



I. INTRODUCTION 

It is an age-old problem to determine the price at which
to conduct a trade. In the highly competitive and
relentlessly fast-paced markets of today's financial world,
limit order books (LOBs) are used to match buyers and
sellers in more than half of the world's financial markets
(Roşu 2009). Euronext, the Australian Securities
Exchange, and the Helsinki, Hong Kong, Swiss, Tokyo,
Toronto, Vancouver, and Shenzhen Stock Exchanges all
now operate as pure limit order markets (Gu et al. 2008b;
Luckock 2001); the New York Stock Exchange (NYSE),
NASDAQ, and the London Stock Exchange (LSE) (Cont
et al. 2010) all operate a bespoke hybrid limit order
system. Thanks to technological advances, traders around
the globe now have real-time access to the current LOB,
providing buyers and sellers alike "the ultimate
microscopic level of description" (Bouchaud et al. 2002).



There are obvious practical advantages to understanding
the dynamics of the LOB. These include gaining
clearer insight into the relative financial merits of
different kinds of orders in given situations (Harris and
Hasbrouck 1996), optimal order execution strategies
(Obizhaeva and Wang 2005), and market impact
minimization (Eisler et al. 2010). By providing researchers
with more detailed information about order flow than
has ever been available before, the study of LOBs has
helped illuminate possible causes for well-established
empirical regularities that have been observed across a wide
range of markets (Bouchaud et al. 2009; Farmer and
Lillo 2004). Furthermore, a LOB is a prime example
of a complex system (Mitchell 2009), where global
phenomena arise from the behaviour of many interacting
agents when the system throughput becomes sufficiently
large. The unusually rich and high-quality historic data
that are available from some LOBs make them a suitable
testing ground for a wide variety of ideas from the
complex systems literature, such as universality, scaling, and
emergence. In this survey, we discuss some of the key
ideas that have emerged from the analysis and modelling
of limit order trading in recent years. We also examine
existing models of limit order trading and discuss their
strengths and limitations.



This survey is preceded by a number of other survey
articles that each focus on particular aspects of the
mechanism. Friedman (2005) reviewed early studies of "double
auction" style trading (of which the LOB is an example).
Parlour and Seppi (2008) addressed the economic and
theoretical aspects of the trading mechanism. Bouchaud
et al. (2009) assessed the current understanding of price
formation in limit order markets. Chakraborti et al.
(2011a,b) examined the role of econophysics in
understanding LOB behaviour. In light of these studies, we do
not focus heavily on these aspects here.

Analyses of LOBs have taken a variety of starting 
points, drawing on ideas from economics, physics, math- 
ematics, statistics, and psychology. Unsurprisingly, there 
is no clear consensus on the best approach. This point 
is exemplified by the contrast between the "bottom-up" 
approach normally taken in the economics literature, in 
which models focus on the behaviour of individual market 
participants and present the LOB as a sequential game 
(Foucault 1999; Parlour 1998; Roşu 2009), with the
approach in the physics literature, in which order flows are
treated as random and techniques from statistical
mechanics are employed to explore LOB equilibria (Challet
and Stinchcombe 2001; Cont et al. 2010; Smith et al.
2003). In the current paper, we discuss developments in
both the economics and physics literatures, while
emphasizing the aspects of limit order trading most relevant to
practitioners.

The remainder of the survey is organized as follows. In
Section II, we discuss several formal definitions related to
limit order trading, in order to formulate a mathematical
description of the process. In Section III, we discuss
a variety of practical aspects of limit order trading and
examine the mathematical difficulties that arise from
attempting to quantify them. In Section IV, we examine
the important role of empirical studies in deepening
understanding of the LOB, highlighting both consensus and
disagreement within the literature. We examine a
selection of models in Section V. Finally, Section VI contains
our conclusions and discussions of what we believe to be
the key unresolved issues in the field.



II. A MATHEMATICAL DESCRIPTION OF LIMIT 
ORDER TRADING 

In this section, we formulate a precise description of 
limit order trading that is intended to describe the key 
aspects common to most limit order markets. Of course, 
individual exchanges and trading platforms each operate 
under slight variations of these core principles. Harris
(2003) provides a comprehensive review of specific
details governing particular exchanges, so we do not focus
heavily on them here.






A. Preliminaries 

Before limit order trading grew in popularity, most fi- 
nancial trades took place in quote-driven marketplaces, 
where a handful of large market makers centralize "buy" 
and "sell" orders by publishing the prices at which they 
are willing to buy and sell the asset being traded. The 
sell price will always be higher than the buy price, al- 
lowing the market maker to earn a profit in exchange for 
providing liquidity to the market¹ and taking on the risks
of market making. Such risks include acquiring an unde- 
sirable inventory position that later has to be unwound 
and being exposed to adverse selection (i.e., encounter- 
ing other market participants who have better informa- 
tion about the value of the asset and who can therefore 
make a profit by buying or selling, often repeatedly, with
the market maker (Parlour and Seppi 2008)). Any other
market participants who want to buy or sell the asset
only have access to the prices made publicly available by 
these market makers, and the only action available to a 
market participant who wishes to buy (respectively, sell) 
the asset is to immediately buy (respectively, sell) at one 
of the prices that the market makers have made avail- 
able. The purchasing and sale of tickets via ticket touts 
is an example of a quote-driven market in action. 

A limit order market is much more flexible because 
every market participant has the option of posting buy 
(respectively, sell) orders: 

Definition. An order \(x = (p_x, \omega_x)\) of price \(p_x\) and of
size \(\omega_x > 0\) (respectively, \(\omega_x < 0\)) is a commitment to
sell (respectively, buy) up to \(|\omega_x|\) units of the asset being
traded at a price no less than (respectively, no greater
than) \(p_x\).

The units of order size are set by the following: 

Definition. The lot size, denoted \(\sigma\), is the smallest
amount of the asset that can be traded in the market.
Furthermore, all orders² must arrive with a size

\[ \omega_x \in \{k\sigma \mid k \in \mathbb{Z}\}. \]

The units of price are set by the following: 

Definition. The tick size \(\delta p\) is the smallest price interval
between different orders that is permissible in the market.
Furthermore, all orders must arrive with a price that is
specified to the accuracy of \(\delta p\).



For example, if \(\delta p\) = $0.00001, then the largest
permissible order price that is strictly less than $1.00 is
$0.99999, and any order would have to be submitted at
a price with exactly 5 decimal places.

Definition. The lot size \(\sigma\) and the tick size \(\delta p\) are
collectively called the LOB's resolution parameters.
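
As an illustration of the definitions above, the following minimal sketch represents an order as a (price, size) pair and checks it against the resolution parameters. The names `Order`, `respects_resolution`, `sigma`, and `delta_p` are hypothetical, and the lot size of 1 is an assumption made only for this example; the tick size is the one from the example in the text.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    """An order x = (p_x, omega_x): omega_x > 0 is a sell, omega_x < 0 is a buy."""
    price: Decimal
    size: Decimal

    @property
    def is_sell(self) -> bool:
        return self.size > 0


def respects_resolution(order: Order, lot_size: Decimal, tick_size: Decimal) -> bool:
    """True if the size is a non-zero multiple of the lot size and the price is on the tick grid."""
    on_lot_grid = order.size != 0 and order.size % lot_size == 0
    on_tick_grid = order.price % tick_size == 0
    return on_lot_grid and on_tick_grid


sigma = Decimal("1")          # hypothetical lot size (assumption)
delta_p = Decimal("0.00001")  # tick size from the example in the text

# A buy order for 3 lots at $0.99999 lies on both grids.
valid = respects_resolution(Order(Decimal("0.99999"), Decimal("-3")), sigma, delta_p)
# A price quoted to 6 decimal places is finer than the tick size.
invalid = respects_resolution(Order(Decimal("0.999995"), Decimal("3")), sigma, delta_p)
```

Exact decimal arithmetic is used here because binary floating-point numbers cannot represent most tick-grid prices exactly.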

When a buy (respectively, sell) order is submitted, a 
LOB's trade-matching algorithm checks whether it is pos- 
sible to perform a matching to some other previously sub- 
mitted sell (respectively, buy) order. If so, the matching 
occurs immediately. If not, the newly submitted order 
becomes active; it remains active until either it becomes 
matched to another incoming sell (respectively, buy) or- 
der or it is cancelled. 

Definition. An active order at time t is an order that 
has been submitted at some time t' < t, but has not been 
matched or cancelled by time t. 

Cancellation usually occurs because the owner of an 
order no longer wishes to offer a trade at the stated price, 
but rules governing the marketplace can also lead to the 
cancellation of orders if they remain unmatched for a 
specified period of time, or at certain times of day. For 
example, on the electronic limit order trading platform 
Hotspot FX, all active orders are cancelled at 5pm EST 
to prevent too large an accumulation of active orders over
time (Gould et al. 2011).



It is precisely the active orders in a market that make 
up the LOB: 

Definition. The LOB L(t) is the set of all active orders 
in the market at time t. 

The LOB is then a set of queues, with each queue con- 
sisting of active buy or sell orders at the specified price. 
The concepts of the bid price, ask price, mid price, and 
bid-ask spread, common to much of the finance literature, 
can be made specific in the context of limit order trading: 

Definition. The bid price at time t, denoted b(t), is the
highest stated price among active buy orders in the LOB
L(t),

\[ b(t) = \max_{\{x \in L(t) \mid \omega_x < 0\}} p_x. \]



¹ Liquidity is difficult to define formally. Kyle (1985) instead
identified the three key properties of a liquid market to be tightness
("the cost of turning around a position over a short period of
time"), depth ("the size of an order-flow innovation required to
change prices a given amount") and resiliency ("the speed with
which prices recover from a random, uninformative shock").

² In some real markets there are two lot-size parameters: a
minimum size \(\sigma\) and an increment \(\varepsilon\). In such markets, sell orders
must arrive with a size \(\omega_x \in \{\sigma + k\varepsilon \mid k = 0, 1, 2, \ldots\}\) and buy
orders must arrive with a size \(\omega_x \in \{-(\sigma + k\varepsilon) \mid k = 0, 1, 2, \ldots\}\).
For simplicity, we assume \(\sigma = \varepsilon\).



Definition. The ask price at time t, denoted a(t), is the
lowest stated price among active sell orders in the LOB
L(t),

\[ a(t) = \min_{\{x \in L(t) \mid \omega_x > 0\}} p_x. \]

Definition. The bid-ask spread at time t, denoted s(t),
is the difference between the ask price and the bid price
at time t: s(t) = a(t) - b(t).






Definition. The mid price at time t, denoted m(t), is
the average of the ask price and the bid price at time t:
m(t) = [a(t) + b(t)]/2.

In a limit order market, b(t) is the highest price at
which it is possible to sell immediately at least the lot
size of the asset being traded, and a(t) is the lowest price
at which it is possible to buy immediately at least the lot
size of the asset being traded, at time t.
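
The bid price, ask price, bid-ask spread, and mid price can be read directly off the set of active orders. The sketch below is one way to do so, assuming a book stored as a list of (price, size) pairs with the sign convention of the definitions above. The function name `best_quotes` and the depths in `book` are hypothetical; only the best prices $1.50 and $1.53 are taken from the example used later in this section.

```python
from decimal import Decimal as D


def best_quotes(book):
    """Return (b, a, s, m) for a book holding at least one buy and one sell order.

    book is a list of (price, size) pairs; size < 0 marks a buy, size > 0 a sell.
    """
    bid = max(price for price, size in book if size < 0)
    ask = min(price for price, size in book if size > 0)
    return bid, ask, ask - bid, (ask + bid) / 2


# Hypothetical book: the depths are assumptions made for illustration.
book = [
    (D("1.50"), -2), (D("1.49"), -1), (D("1.48"), -3),   # active buy orders
    (D("1.53"), 3), (D("1.54"), 2), (D("1.55"), 4),      # active sell orders
]

bid, ask, spread, mid = best_quotes(book)
```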

Sometimes it is more helpful to consider prices relative
to b(t) and a(t), rather than actual prices.³

Definition. For a given price p, the bid-relative price
\(\Delta_b(p)\) is the difference between the bid price and the given
price: \(\Delta_b(p) = b(t) - p\).

Definition. For a given price p, the ask-relative price
\(\Delta_a(p)\) is the difference between the given price and the
ask price: \(\Delta_a(p) = p - a(t)\).

Notice the difference in signs between the two above
definitions: \(\Delta_b(p)\) measures the distance that p is
"behind" b(t), i.e., how much smaller p is than b(t); \(\Delta_a(p)\)
measures the distance that p is "behind" a(t), i.e., how
much larger p is than a(t).

Often it is desirable to compare orders on the bid side 
and the ask side of the limit order book. In these cases, 
the concept of a single relative price of an order is useful: 

Definition. For a given order \(x = (p_x, \omega_x)\), the relative
price of the order, denoted \(\Delta_x\), is defined as

\[ \Delta_x = \begin{cases} \Delta_b(p_x), & \text{if the order is a buy order}, \\ \Delta_a(p_x), & \text{if the order is a sell order}. \end{cases} \]

Arriving orders may then be partitioned according to
their relative price:

• Any order x that arrives with a relative price \(\Delta_x \ge 0\)
will not cause an immediate matching, instead
becoming an active order "inside the book".

• Any order x that arrives with a relative price
\(-s(t) < \Delta_x < 0\) will also not cause an immediate
matching, instead becoming an active order "inside
the spread".

• Any order x that arrives with a relative price
\(\Delta_x \le -s(t)\) will cause an immediate matching.
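
The three-way partition above can be sketched as a short function. This is an illustration under the sign conventions of this section, not a description of any exchange's matching engine; the names `relative_price` and `classify` are hypothetical, and the bid and ask prices are the initial values $1.50 and $1.53 used in the example later in this section.

```python
from decimal import Decimal as D


def relative_price(price, size, bid, ask):
    """Delta_x: b(t) - p_x for a buy order (size < 0), p_x - a(t) for a sell order (size > 0)."""
    return bid - price if size < 0 else price - ask


def classify(price, size, bid, ask):
    """Assign an arriving order to one of the three classes of the partition."""
    delta = relative_price(price, size, bid, ask)
    spread = ask - bid
    if delta >= 0:
        return "inside the book"
    if delta > -spread:
        return "inside the spread"
    return "immediate matching"


bid, ask = D("1.50"), D("1.53")
buy_deep = classify(D("1.48"), -3, bid, ask)     # Delta_x = 0.02
buy_inside = classify(D("1.51"), -3, bid, ask)   # Delta_x = -0.01
buy_cross = classify(D("1.55"), -3, bid, ask)    # Delta_x = -0.05
sell_at_bid = classify(D("1.50"), 4, bid, ask)   # Delta_x = -0.03 = -s(t)
```

The boundary cases matter: an order placed exactly at the best price on its own side has relative price 0 and joins the queue, whereas an order placed exactly at the opposite best price has relative price -s(t) and is matched.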

In order to assess the state of L(t), most market
participants view the depth profile or relative depth profile
of the LOB:



³ In the existing literature, a wide range of different naming and
sign conventions are used by different authors to describe slightly
different definitions of the concept of relative price. Here, we
introduce the explicit distinction between bid-relative price and
ask-relative price in an attempt to remove any potential
confusion.



Definition. The bid-side depth available at price p and
at time t, denoted \(n_b(p,t)\), is the total size of all active
buy orders at price p,

\[ n_b(p,t) = \sum_{\{x \in L(t) \mid p_x = p,\ \omega_x < 0\}} \omega_x. \]

The ask-side depth available at price p and at time t,
denoted \(n_a(p,t)\), is the total size of all active sell orders
at price p,

\[ n_a(p,t) = \sum_{\{x \in L(t) \mid p_x = p,\ \omega_x > 0\}} \omega_x. \]

The depth available is often stated in multiples of the
lot size. Notice that because \(\omega_x < 0\) for buy orders and
\(\omega_x > 0\) for sell orders, it follows that \(n_b(p,t) \le 0\) and
\(n_a(p,t) \ge 0\) for all prices p.

Definition. The bid-side depth profile of the LOB at
time t is the set of all ordered pairs \((p, n_b(p,t))\). The
ask-side depth profile of the LOB at time t is the
set of all ordered pairs \((p, n_a(p,t))\).
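
A depth profile is simply an aggregation of signed order sizes by price. The following sketch assumes the book is a list of (price, size) pairs, as in the definition of an order; the function name `depth_profiles` and the sample book are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal as D


def depth_profiles(book):
    """Return (bid_profile, ask_profile): dicts mapping price p to n_b(p, t) and n_a(p, t)."""
    bid_profile = defaultdict(int)
    ask_profile = defaultdict(int)
    for price, size in book:
        if size < 0:
            bid_profile[price] += size   # buy sizes are negative, so n_b(p, t) <= 0
        elif size > 0:
            ask_profile[price] += size   # sell sizes are positive, so n_a(p, t) >= 0
    return dict(bid_profile), dict(ask_profile)


# Hypothetical book with several active orders queued at the same price.
book = [
    (D("1.50"), -1), (D("1.50"), -1), (D("1.49"), -1),
    (D("1.53"), 2), (D("1.53"), 1), (D("1.54"), 2),
]
bid_profile, ask_profile = depth_profiles(book)
```

Prices with no active orders are omitted from the dictionaries, which corresponds to a depth of zero at those prices.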

Definition. The mean bid-depth available at price p
between times \(t_1\) and \(t_2\), denoted \(\bar{n}_b(p,t_1,t_2)\), is

\[ \bar{n}_b(p,t_1,t_2) = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} n_b(p,t)\,dt. \]

The mean ask-depth available at price p between times
\(t_1\) and \(t_2\), denoted \(\bar{n}_a(p,t_1,t_2)\), is defined similarly, using
the ask-side depth available.
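
Because the depth at a fixed price changes only when an order arrives, is matched, or is cancelled, it is piecewise constant in time, and the integral in the definition above reduces to a finite sum. The sketch below evaluates that sum; the function name `mean_depth` and the input format (a list of change times with the depth that holds from each time onwards) are assumptions made for illustration.

```python
def mean_depth(change_times, depths, t1, t2):
    """Time average over [t1, t2] of a piecewise-constant depth at one fixed price.

    depths[i] holds on [change_times[i], change_times[i + 1]); the last value
    persists indefinitely. Requires change_times[0] <= t1 < t2.
    """
    if not change_times or not (change_times[0] <= t1 < t2):
        raise ValueError("need change_times[0] <= t1 < t2")
    total = 0.0
    for i, depth in enumerate(depths):
        start = max(change_times[i], t1)
        end = t2 if i + 1 == len(depths) else min(change_times[i + 1], t2)
        if end > start:
            total += depth * (end - start)   # contribution of this constant piece
    return total / (t2 - t1)


# Hypothetical bid-side depth at one price: -2 on [0, 10), -5 on [10, 30), -1 afterwards.
full_window = mean_depth([0, 10, 30], [-2, -5, -1], 0, 40)
sub_window = mean_depth([0, 10, 30], [-2, -5, -1], 5, 20)
```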

Because b(t) and a(t) themselves vary over time, it is 
rarely illuminating to consider the depth available at a 
specific price over time. However, relative pricing pro- 
vides a useful alternative: 

Definition. The bid-side relative depth available at
price p and at time t, denoted \(N_b(p,t)\), is the total size
of all active buy orders with relative price p,

\[ N_b(p,t) = \sum_{\{x \in L(t) \mid \Delta_x = p,\ \omega_x < 0\}} \omega_x. \]

The ask-side relative depth available at price p and at
time t, denoted \(N_a(p,t)\), is the total size of all active sell
orders with relative price p,

\[ N_a(p,t) = \sum_{\{x \in L(t) \mid \Delta_x = p,\ \omega_x > 0\}} \omega_x. \]

Definition. The bid-side relative depth profile of
the LOB at time t is the set of all ordered pairs
\((\Delta_x, N_b(\Delta_x,t))\). The ask-side relative depth profile
of the LOB at time t is the set of all ordered pairs
\((\Delta_x, N_a(\Delta_x,t))\).
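
The relative depth profile is obtained from the same aggregation as the depth profile, with each price replaced by its distance behind the best price on its own side. A minimal sketch, assuming a book given as (price, size) pairs with at least one order on each side; `relative_depth_profiles` and the sample depths are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal as D


def relative_depth_profiles(book):
    """Return dicts mapping relative price to N_b and N_a for a list of (price, size) pairs."""
    bid = max(price for price, size in book if size < 0)
    ask = min(price for price, size in book if size > 0)
    rel_bid, rel_ask = defaultdict(int), defaultdict(int)
    for price, size in book:
        if size < 0:
            rel_bid[bid - price] += size   # bid-relative price b(t) - p
        elif size > 0:
            rel_ask[price - ask] += size   # ask-relative price p - a(t)
    return dict(rel_bid), dict(rel_ask)


# Hypothetical book (depths are assumptions).
book = [
    (D("1.50"), -2), (D("1.49"), -1), (D("1.48"), -3),
    (D("1.53"), 3), (D("1.54"), 2), (D("1.55"), 4),
]
rel_bid, rel_ask = relative_depth_profiles(book)
```

All active orders have non-negative relative price, so both profiles are supported on the non-negative multiples of the tick size.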






Definition. The mean bid-depth available at relative
price p between times \(t_1\) and \(t_2\), denoted \(\bar{N}_b(p,t_1,t_2)\), is

\[ \bar{N}_b(p,t_1,t_2) = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} N_b(p,t)\,dt. \]

The mean ask-depth available at relative price p between
times \(t_1\) and \(t_2\), denoted \(\bar{N}_a(p,t_1,t_2)\), is defined similarly,
using the ask-side relative depth available.

Definition. The mean bid-side relative depth profile
between times \(t_1\) and \(t_2\) is the set of all ordered pairs
\((\Delta_x, \bar{N}_b(\Delta_x,t_1,t_2))\). The mean ask-side relative
depth profile between times \(t_1\) and \(t_2\) is the set of all
ordered pairs \((\Delta_x, \bar{N}_a(\Delta_x,t_1,t_2))\).

The relative depth profile provides no information
about the actual prices at which trades occur, nor does it
contain information about s(t) or m(t). However, order
arrival rates have been widely observed to depend on
relative prices and not actual prices (see, e.g., (Biais et al.
1995; Bouchaud et al. 2002; Potters and Bouchaud 2003;
Zovko and Farmer 2002)), so it is common to consider
the relative depth profiles and b(t) and a(t) together, in
order to gain the full picture of LOB dynamics.

It is often also useful to study the properties of all 
active limit orders together. 

Definition. The total depth available on the bid side of
the LOB at time t, denoted \(N_b(t)\), is the total size of all
active buy orders at time t,

\[ N_b(t) = \sum_{\{x \in L(t) \mid \omega_x < 0\}} \omega_x. \]

The total depth available on the ask side of the LOB at
time t is defined similarly:

\[ N_a(t) = \sum_{\{x \in L(t) \mid \omega_x > 0\}} \omega_x. \]

Definition. The total depth available in the limit order
book L(t), denoted by N(t), is given by

\[ N(t) = |N_b(t)| + N_a(t). \]

Figure 1 shows a schematic of a LOB at some instant
in time, illustrating the definitions in this section. The 
horizontal lines within the blocks at a given price level 
denote how the depth available at that price is made up 
from different active orders. 

Time series of prices are a common object of study in 
the empirical literature on LOBs. As discussed in Section
IV.H, it is a recurring theme in these studies that the
behaviour exhibited by such time series depends heavily on
how they are constructed. For example, consider the time
series \(m(t_1), \ldots, m(t_n)\), for some set of times \(t_1, \ldots, t_n\):



[Figure 1: schematic of a LOB plotted against Price (USD), showing the
Bid Side (buy limit orders) and the Ask Side (sell limit orders), with
the bid-price, mid-price, ask-price, and spread marked.]

FIG. 1 Schematic of a LOB



• When the \(t_i\) are regularly spaced in time, with an
elapsed time of \(\tau\) seconds between them, such a
time series is said to be constructed on a \(\tau\)-second
timescale.

• When the \(t_i\) are chosen to correspond to the times
of arrivals and departures of orders from the LOB,
the \(t_i\) are irregularly spaced in time. Such a time
series is said to be constructed on an event-by-event
timescale.

• When the \(t_i\) are chosen to correspond to the times
of trades (i.e., matchings in the LOB), the \(t_i\) are
also irregularly spaced in time. Such a time
series is said to be constructed on a trade-by-trade
timescale.
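
The three constructions can be compared on a single event log. The sketch below uses a hypothetical log of (time, mid price after the event, trade flag) triples. For the regularly spaced series it assumes that m(t) is taken to be the mid price set by the most recent event at or before t, which is one common convention rather than something prescribed by the definitions above.

```python
import bisect

# Hypothetical event log: (time in seconds, mid price after the event, was it a trade?)
events = [(0.0, 1.515, False), (0.4, 1.520, False), (1.7, 1.515, True), (3.2, 1.510, True)]
times = [t for t, _, _ in events]
mids = [m for _, m, _ in events]


def regular_series(start, stop, tau):
    """Mid price sampled at start, start + tau, ..., up to stop (tau-second timescale)."""
    series = []
    t = start
    while t <= stop:
        i = bisect.bisect_right(times, t) - 1   # index of the last event at or before t
        if i >= 0:
            series.append((t, mids[i]))
        t += tau
    return series


event_series = [(t, m) for t, m, _ in events]                     # event-by-event
trade_series = [(t, m) for t, m, is_trade in events if is_trade]  # trade-by-trade
one_second_series = regular_series(0, 4, 1)                       # 1-second timescale
```

Even on this small log the three series differ in length and in the price changes they record, which is the dependence on construction that the text refers to.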



B. Price changes in limit order books 

The rules that govern matchings in LOBs dictate how
prices evolve through time in these markets. The arrival
of an order with a relative price \(\Delta_x \ge 0\) will never cause
either of b(t) or a(t) to change. When a buy (respectively,
sell) order with a relative price \(-s(t) < \Delta_x < 0\) arrives,
it will become the active buy order with highest
(respectively, active sell order with lowest) price in the LOB,
and b(t) will increase (respectively, a(t) will decrease) to
the price of this order. Whether or not the occurrence
of a matching causes b(t) (respectively, a(t)) to change
will depend on \(n_b(b(t),t)\) (respectively, \(n_a(a(t),t)\)) and the
size of the incoming order that triggered the matching.
In particular, the new bid price immediately after the
arrival of a sell order \(x = (p_x, \omega_x)\) with a relative price
\(\Delta_x \le -s(t)\) is

\[ \max(p_x, q), \quad \text{where } q = \arg\max_{r'} \sum_{r=r'}^{b(t)} |n_b(r,t)| > \omega_x. \tag{1} \]

Similarly, the new ask price immediately after the
arrival of a buy order x of size \(\omega_x < 0\) with a relative price
\(\Delta_x \le -s(t)\) and an actual price \(p_x\) is

\[ \min(p_x, q), \quad \text{where } q = \arg\min_{r'} \sum_{r=a(t)}^{r'} n_a(r,t) > |\omega_x|. \tag{2} \]

Table I lists a number of possible market events, along
with the changes that they would cause to b(t), a(t), m(t),
and s(t), if they were to occur in the LOB displayed in
Figure 2.

[Figure 2: bar chart of an example LOB; horizontal axis: Price (USD),
from 1.47 to 1.55; vertical axis ticks from -4 to 4.]

FIG. 2 An example LOB

                                  Values after event ($)
Event                             b(t)    a(t)    m(t)     s(t)
Initial values                    1.50    1.53    1.515    0.03
Buy order, size 3, price $1.48    1.50    1.53    1.515    0.03
Buy order, size 3, price $1.51    1.51    1.53    1.52     0.02
Buy order, size 3, price $1.55    1.50    1.54    1.52     0.04
Buy order, size 5, price $1.55    1.50    1.55    1.525    0.05
Sell order, size 4, price $1.54   1.50    1.53    1.515    0.03
Sell order, size 4, price $1.52   1.50    1.52    1.51     0.02
Sell order, size 4, price $1.47   1.48    1.53    1.505    0.05
Sell order, size 4, price $1.50   1.49    1.50    1.495    0.01

TABLE I How each specified market event would affect prices
if it occurred immediately after the initial LOB state
displayed in Figure 2.

Price changes in markets are commonly studied via the
concept of returns:

Definition. The bid-price return between times \(t_1\) and
\(t_2\), denoted \(R_b(t_1,t_2)\), is the change in the bid price
between time \(t_1\) and time \(t_2\):
\(R_b(t_1,t_2) = [b(t_2) - b(t_1)]/b(t_1)\). The ask-price
return between times \(t_1\) and \(t_2\), \(R_a(t_1,t_2)\), and the
mid-price return between times \(t_1\) and \(t_2\), \(R_m(t_1,t_2)\), are
defined similarly.

Definition. The bid-price logarithmic return between
times \(t_1\) and \(t_2\), denoted \(r_b(t_1,t_2)\), is the natural
logarithm of the ratio of the bid price at time \(t_2\) and the bid
price at time \(t_1\): \(r_b(t_1,t_2) = \log[b(t_2)/b(t_1)]\). The ask-price
logarithmic return between times \(t_1\) and \(t_2\), \(r_a(t_1,t_2)\), and
the mid-price logarithmic return between times \(t_1\) and
\(t_2\), \(r_m(t_1,t_2)\), are defined similarly.



C. Orders: the building blocks of a limit order book



Although LOBs provide a far greater level of flexibility 
to market participants than do quote-driven markets, the 
actions of market participants in a limit order market can 
always be expressed solely in terms of the submission or 
cancellation of orders of the lot size. For example, an 
impatient market participant who immediately sells \(4\sigma\)
units of the asset being traded in the LOB displayed in
Figure 2 can be thought of as submitting 2 sell orders
of size \(\sigma\) at the price $1.50, 1 sell order of size \(\sigma\) at the
price $1.49, and 1 sell order of size \(\sigma\) at the price $1.48.
Similarly, a patient market participant who posts a sell
order of size \(4\sigma\) at the price $1.55 can be thought of as
submitting 4 sell orders of size \(\sigma\), each at this price.
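
The worked example above can be traced with a small matching routine. The sketch below is a simplified illustration, not the algorithm of any particular exchange: it ignores priority within a price level, and the function name `submit` and the dictionaries `BIDS` and `ASKS` are hypothetical. The depths are assumptions chosen to agree with the example (2 units at $1.50 and 1 unit at $1.49 on the bid side) and with the bid and ask prices listed in Table I.

```python
from decimal import Decimal as D

# Hypothetical depths in lots (assumptions): price -> |n_b(p, t)| and price -> n_a(p, t).
BIDS = {D("1.50"): 2, D("1.49"): 1, D("1.48"): 3}
ASKS = {D("1.53"): 3, D("1.54"): 2, D("1.55"): 4}


def submit(bids, asks, side, price, size):
    """Apply one incoming order to copies of the book and return (b, a, m, s) afterwards.

    Assumes both sides of the book are non-empty after the order is processed.
    """
    bids, asks = dict(bids), dict(asks)
    own, opposite = (bids, asks) if side == "buy" else (asks, bids)
    remaining = size
    while remaining > 0 and opposite:
        best = min(opposite) if side == "buy" else max(opposite)
        crosses = price >= best if side == "buy" else price <= best
        if not crosses:
            break
        traded = min(remaining, opposite[best])   # match against the best opposite price
        remaining -= traded
        opposite[best] -= traded
        if opposite[best] == 0:
            del opposite[best]                    # price level exhausted
    if remaining > 0:
        own[price] = own.get(price, 0) + remaining   # unmatched part becomes active
    bid, ask = max(bids), min(asks)
    return bid, ask, (bid + ask) / 2, ask - bid


# Selling 4 lots with a limit of $1.47 matches 2 lots at $1.50, 1 at $1.49, and 1 at $1.48.
after_sell = submit(BIDS, ASKS, "sell", D("1.47"), 4)
# Selling 4 lots with a limit of $1.50 matches 2 lots; the other 2 become active at $1.50.
after_partial = submit(BIDS, ASKS, "sell", D("1.50"), 4)
```

Because an incoming order is processed one price level at a time, the same routine handles orders that match fully, partially, or not at all.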

In almost all the published literature on LOBs, the fol- 
lowing terminology has been adopted. Orders that result 
in an immediate matching upon submission are known 
as market orders. Orders that do not, instead becoming 
active orders, are known as limit orders.⁴ However, it
is important to recognize that this terminology is only 
used to emphasize whether an incoming order triggers 
an immediate matching or not. There is no fundamental 
difference between a limit order and a market order. 

Some limit order markets allow impatient market par- 
ticipants to specify that they wish to submit a buy (re- 
spectively, sell) market order, without explicitly specify- 
ing a price. Instead, the market participant specifies only 
a size, and the matching algorithm sets the price of the 
order appropriately to initiate the required matching(s). 



D. The rise in popularity of limit order books 



Glosten (1994) argued that limit order markets are an
effective way for "patient" market participants to provide
liquidity to "less patient" market participants, even in
situations where liquidity is scarce. He also argued that
limit order markets are immune to competition by other
exchange set-ups (such as quote-driven marketplaces, as
described above). Luckock (2003) concluded that the
volume of trade in a limit order market would always exceed
that of a Walrasian market⁵ given the same underlying
supply and demand. Foucault et al. (2005) argued that
the popularity of limit order markets was due in part to
their ability to allow impatient market participants to
demand "immediacy", while simultaneously allowing
patient market participants to supply it to those who will
later require it.

⁴ Some practitioners use the terms "aggressive orders" and "resting
orders", respectively, but this terminology is far less common in
the published literature.

⁵ A Walrasian market is one in which all market participants send
their desired buy or sell orders to a specialist, who then
determines the market value of the asset by selecting the price that
will maximize the volume of trade.



III. LIMIT ORDER BOOK CHALLENGES

In this section, we discuss some of the challenges that
LOBs present researchers. In particular, we discuss
technical issues associated with the study of empirical LOB
data, and present several challenges inherent in modelling
a LOB within the general framework set out in Section
II.



A. Different perspectives

In order to construct a useful model of a LOB, certain
assumptions must inevitably be made. One such
assumption regards the reason that order flows exist at all. In
much of the mainstream economics literature, order
submission is motivated by the assumption that "perfectly
rational" agents attempt to maximize their "utility" by
making trades in markets driven by "information"
(Parlour and Seppi 2008). However, this methodology has
come under increasing scrutiny. For example, Gode and
Sunder (1993) highlighted how utility maximization is
often inconsistent with direct observations of individual
behaviour, and Smith et al. (2003) noted that the
frequency with which existing active orders are modified is
far too low to account for the arrival of every new piece of
unanticipated information. Therefore, perfect rationality
must be viewed as an extreme modelling assumption that
is made in order to provide a framework within which
calculations can be performed.

At the other extreme lies the zero-intelligence
approach, in which aggregated order flows are assumed to
be governed by a specified stochastic process, rather than
decomposed into their constituent actions and motivated
by individual agents attempting to maximize some
personal utility function (Cont et al. 2010; Daniels et al.
2002; Smith et al. 2003). Much like perfect rationality,
the zero-intelligence approach is an extreme
simplification that is inconsistent with empirical observations. For
example, Boehmer et al. (2005) found detectable changes
in order flows when the NYSE increased the amount of
information about the LOB that was made available to
market participants in real time. Furthermore, Bortoli
et al. (2006) noted that when the Sydney Futures
Exchange implemented a similar change, the depth
available at the best prices became smaller and market orders
for quantities larger than those offered at the best prices
became more frequent.⁶ This suggests that market
participants were using such information when deciding how
to interact with the LOB, and were therefore not acting
with zero intelligence. However, zero intelligence has the
appeal of being an easily quantifiable concept, and leads
to falsifiable predictions that are testable without the
need for a series of auxiliary assumptions. It is,
therefore, a useful starting point for building models.⁷

Between the two extremes of perfect rationality and 
zero intelligence lies a broad range of other approaches 
that make weaker assumptions about market partici- 
pants' behaviour and order flows, at the cost of result- 
ing in models that are more difficult to study. Many 
such models rely exclusively on Monte-Carlo simulation 
to produce output. Although such simulations still moti- 
vate interesting observations, it is often difficult to trace 
exactly how specific model outputs are affected by the 
input parameters, whereas in analytical work such de- 
pendence is explicit throughout. 



B. State-space complexity 

It is a well-established empirical fact that order flows 
depend on both L(t) and on recent order flows (Biais
et al. 1995; Ellul et al. 2003; Hall and Hautsch 2006;
Hollifield et al. 2004; Lo and Sapp 2010; Sandås 2001).
From a perfect-rationality perspective, this can be
thought of as market participants reacting to the chang- 
ing state of the market; from a zero-intelligence perspec- 
tive, this can be thought of as order flow rates depending 
on L(t) and on recent order flows. From either perspective,
a key task is to uncover the structure of such
conditional behaviour, either to understand what information
tional behaviour, either to understand what information 
market participants evaluate when making decisions or 
to quantify the conditional structure of order flows. 

However, the state space is huge: if there are P different
choices for price, the state space of the depth profile
is \(\mathbb{Z}^P\). This makes it very difficult to investigate such
dependences, as the number of variables upon which to
condition is so large. Therefore, a key task in LOB modelling
is to find a way to simplify the evolving, high-dimensional
state space, while retaining the important features.
Although some suggestions for such dimensionality
reduction have been made (see, e.g., (Cont and de Larrard
2011; Eliezer and Kogan 1998; Smith et al. 2003)), there
is no consensus about a simplified state space upon which
a very general LOB model can be constructed.

⁶ Several other studies have investigated such changes (see, e.g.,
(Boehmer et al. 2005; Madhavan et al. 2005; Mizrach 2008)),
but they have all been based on hybrid limit order markets in
which (even before the change) some market participants had
access to more information about the LOB than others.

⁷ We explore in Section V how some authors have attempted to
quantify perfect rationality for modelling purposes and discuss
the often highly unrealistic assumptions that such formulations
require in order to be empirically tested. A more detailed
treatment can be found in (Foucault et al. 2005).



C. Feedback 

In addition to market participants' actions depending on
L(t), it is also clearly true that the state of the LOB L(t)
is entirely dependent on market participants' actions. These
two mutual dependences form a feedback loop between L(t) and
market participant behaviour that makes LOB modelling very
difficult.



D. Coupling between b(t) and a(t)

As described in Section II.C, b(t) determines the boundary
condition for sell limit order placement because any sell
order placed at or below b(t) will at least partially match
immediately. A similar role is played by a(t) for buy
orders. There is a strong coupling between b(t) and a(t).
Smith et al. (2003) observed how this nonlinear coupling
makes modelling the LOB such a difficult problem.



E. Quantifying patience

In a LOB, both "patient" and "impatient" market participants
experience benefits and drawbacks for their actions. Patient
market participants stand a chance of executing their trades
at a better price than do impatient ones, but they also run
the risk of their orders never being matched and of future
adverse selection. Conversely, impatient traders never trade
at prices better than b(t) and a(t), but they do not face
the inherent uncertainty associated with placing orders that
do not match immediately. Foucault et al. (2005) conjectured
that arbitrageurs, technical traders, and indexers were most
likely to place impatient orders (due to the fast-paced
nature of their work) and that portfolio managers were most
likely to place patient orders (because their strategies are
generally more focused on the long term). In reality, many
market participants use a combination of both patient and
impatient strategies in their interaction with the LOB,
selecting their actions for each trade based on their
individual needs at that time (Anand et al., 2005). The
bid-ask spread s(t) may be considered to be a measure of how
highly the market values the immediacy and certainty
associated with market orders versus the waiting and
uncertainty associated with limit orders.

Copeland and Galai (1983) noted that a limit order can be
thought of as a derivative contract written to the whole
market, via which the order's owner offers to buy or sell
the specified quantity of the asset at the specified price
to any market participant wishing to accept. For example, if
market participant A submits a sell limit order
$x = (p_x, \omega_x)$, this is equivalent to A offering the
entire market a call option to buy $\omega_x$ units of the
asset at price $p_x$, so long as the order is active. Market
participants offer such derivative contracts - i.e., submit
limit orders - in the hope that they will be able to trade
at better prices than if they simply submitted market
orders. However, whether or not such a contract will be
accepted by another market participant (i.e., whether or not
the limit order will eventually become matched) is
uncertain.


F. Priority 

As shown in Figure 1, it is possible to have active orders
owned by different market participants at the same price at
any given time. Much like priority is given to active orders
with the best (i.e., highest buy or lowest sell) price,
limit order markets also employ a priority system for active
orders within each individual price level.

By far the most common priority mechanism currently used is
price-time. That is, priority is first given to those active
orders with the best price, and ties are broken by selecting
the active order that was placed first among those. As
Parlour (1998) highlighted, price-time priority is an
effective way to encourage market participants to place
limit orders in the LOB. Without a priority mechanism based
on time, there is no incentive for a market participant to
"show their hand" by submitting limit orders at any time
before the desired price becomes the best price.
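
The price-time rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Order` record, its field names, and the example book are hypothetical, and real matching engines handle many details (order types, partial cancellations, self-match prevention) that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class Order:
    owner: str   # hypothetical participant label
    price: float
    size: int    # in units of the lot size
    time: int    # arrival time; a smaller value means an earlier arrival


def match_buy_market_order(asks, size):
    """Match a buy market order against active sell orders by price-time priority.

    Returns a list of (owner, price, matched_size) fills in matching order.
    """
    fills = []
    # Lowest sell price first; ties at a price go to the earliest arrival.
    for order in sorted(asks, key=lambda o: (o.price, o.time)):
        if size == 0:
            break
        matched = min(size, order.size)
        fills.append((order.owner, order.price, matched))
        size -= matched
    return fills


asks = [
    Order("A", 1.51, 2, time=1),
    Order("B", 1.50, 1, time=3),
    Order("C", 1.50, 2, time=2),
]
fills = match_buy_market_order(asks, 4)
print(fills)
```

In this example, C is matched before B because both sit at the best price and C arrived first; A arrived earliest of all but is matched last because its price is worse.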

Another priority mechanism is pro-rata, which is commonly
used in futures markets (Field and Large, 2008). Under this
mechanism, when a tie needs to be broken, each active order
participating in the tie-break receives a matching
proportional to the fraction of the depth available that it
represents at the best price. For example, if a buy market
order of size $3\sigma$ arrived at the LOB displayed in
Figure 3, $\sigma$ of it would match to active order 1 and
$2\sigma$ of it would match to active order 2, because they
correspond to 1/3 and 2/3 of the depth available at a(t),
respectively.
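
The proportional split can be illustrated as follows. This is a sketch only: the order sizes (2 and 4 lots) are hypothetical values chosen so that the two orders hold 1/3 and 2/3 of the depth, as in the example above, and the rounding rules that real pro-rata markets apply to fractional allocations are ignored.

```python
from fractions import Fraction


def pro_rata_fills(order_sizes, market_order_size):
    """Split a market order across same-price orders in proportion to their sizes.

    `order_sizes` maps a label to a size in lots. The matched total is capped
    at the depth available at this price.
    """
    depth = sum(order_sizes.values())
    matched_total = min(market_order_size, depth)
    return {
        label: Fraction(size, depth) * matched_total
        for label, size in order_sizes.items()
    }


# Hypothetical sizes: order 1 holds 1/3 of the depth, order 2 holds 2/3.
fills = pro_rata_fills({"order 1": 2, "order 2": 4}, 3)
print(fills)
```

Exact fractions are used so that the allocation always sums to the matched quantity; a real exchange would additionally specify how fractional lots are rounded and distributed.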

Market participants trading in a pro-rata priority market
are also faced with the substantial difficulty of optimally
selecting limit order sizes, as posting limit orders with
larger sizes than the quantity that is really desired for
trade becomes a viable strategy to gain priority until a
sufficiently large quantity has been matched, at which time
any remaining orders can be cancelled.

FIG. 3 Illustration of a LOB with pro-rata priority

Another alternative priority mechanism is price-size, in
which ties are broken by selecting the active order of the
largest size among those limit orders with the best price.
Until recently, no major exchanges used this priority
mechanism, but in October 2010 the first price-size trading
platform, NASDAQ OMX PSX, was launched (NASDAQ, 2010). Some
exchanges allow market participants to specify a minimum
match size when submitting orders. Other orders with a size
smaller than this are not considered for matching against
such orders. This may be considered to be a
"pseudo-price-size" priority mechanism: small active orders
are often bypassed, effectively giving higher priority to
larger orders.

FIG. 4 Illustration of a LOB containing iceberg orders.
Market participants would only see a depth available of
$n(1.4634, t) = 2\sigma$, not $4\sigma$.

Different priority mechanisms encourage market par- 
ticipants to behave in different ways. Price-time priority 
encourages market participants to act quickly to ensure 
that their active orders sit at the front of the priority 
queue for their chosen price, and price-size and pro-rata 
priority reward traders for placing large limit orders and 
thus for providing greater liquidity to the market. 

Because market participants' behaviour is closely re- 
lated to the priority mechanism used in the specific mar- 
ket, LOB models need to take priority mechanisms into 
account when considering order flow. Furthermore, in
models that attempt to track specific orders, priority 
plays a pivotal role. 



G. Incomplete sampling and hidden liquidity 

The LOB L(t) reflects only the subset of trading intentions
that market participants have announced up to time t.
However, the fact that a market participant has not
announced a desire to trade at a given price does not mean
that they do not want to do so, as they may be keeping their
intentions private from other market participants by only
submitting orders when absolutely necessary. Unsurprisingly,
quantifying market participants' undisclosed trading
intentions presents substantial difficulties when building
models.



Bouchaud et al. (2009) highlighted that a typical snapshot
of a LOB at a given time is very sparse, containing some
active orders that are close to b(t) and a(t), but also
active orders that are very far from these prices. As
described above, this cannot be interpreted as an indication
that few people wish to trade at prices far from b(t) and
a(t), however; it is merely an indication that they have not
announced any intention to do so. Indeed, some market
participants choose not to submit limit orders at all.*
These traders instead watch how the values of b(t) and a(t)
evolve with time and place market orders when certain
criteria are met.

Furthermore, many exchanges now allow market participants to
conceal the extent of their intentions to trade, often at
the cost of paying some penalty in terms of priority or
price. As numerous authors (e.g., (Biais et al., 1995;
Bouchaud et al., 2009; Hollifield et al., 2004)) have
highlighted, this poses significant problems when building
models of LOBs.

* Arbitrageurs are a key example of this. Their strategies
depend on simultaneously buying and selling in an attempt to
make instant profit. Limit orders are of little use to them
because it is uncertain when (if ever) they will be matched.

A common example of how market participants can hide the
extent of their intentions to trade is by using iceberg
orders (also known as hidden-size orders). An iceberg order
is a type of limit order that specifies not only a total
size and price but also a visible size. Other market
participants then only see the visible size. Rules regarding
the treatment of the "hidden" quantity vary greatly from one
exchange to another. In some cases, once a quantity of at
least the visible size is matched to an incoming market
order, another quantity equal to the visible size becomes
visible, with a time priority position equal to that of a
standard limit order placed at this time. This sort of
iceberg order is similar to a market participant watching
the market very carefully and submitting a new limit order
at the same price and size at the exact moment that their
previous limit order is matched to an incoming market order.
A market participant acting in this way is sometimes deemed
to be constructing a synthetic iceberg order. The only
difference between a synthetic iceberg order and a genuine
iceberg order occurs when a market order with a size larger
than the (visible) depth available at the best price
arrives. In this situation, the market order matches to any
visible portions of active orders at the best price
(according to the usual priority rules) and then a portion
of any hidden depth available at this price. By contrast, if
a market participant were submitting small but entirely
visible duplicate limit orders immediately after their
previous orders were matched, a large incoming market order
would be matched only to the active orders that existed at
that time, and the rest of the incoming market order would
instead be matched to the active orders at the next best
price.

For example, if the sell order at $1.4634 highlighted in
Figure 4 were an iceberg order with a visible size of
$2\sigma$ and a hidden size of $2\sigma$, then an incoming
buy market order of size $4\sigma$ would have all $4\sigma$
matched at $1.4634, resulting in an ask-price of $1.4635
immediately after the matching. However, if the highlighted
sell order were a standard limit order of size $2\sigma$,
whose owner was adopting the strategy of resubmitting a
similar limit order of size $2\sigma$ immediately after the
original order was matched, then $2\sigma$ of the incoming
market order would be matched at $1.4634 and $2\sigma$ of
the incoming market order would be matched at $1.4635,
followed by the new (duplicate) limit order repopulating the
book at $1.4634, resulting in an ask-price of $1.4634
immediately after the submission of the limit order.
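
The two cases in this example can be replayed with a small sketch. The representation is an assumption made purely for illustration: prices are integers in units of $0.0001, sizes are in units of the lot size, each price level stores a `[visible, hidden]` pair, and the depth of 4 lots at $1.4635 is hypothetical (the original figure is not reproduced here). Replenishment of the visible portion of an iceberg order is not modelled.

```python
def match_buy_market_order(asks, size):
    """Match a buy market order against the sell side of a toy book.

    `asks` maps price -> [visible_depth, hidden_depth] and is updated in place.
    Prices are visited from lowest to highest; at each price, visible depth
    is matched before hidden depth. Returns {price: matched_quantity}.
    """
    fills = {}
    for price in sorted(asks):
        for slot in (0, 1):  # 0 = visible depth, 1 = hidden depth
            matched = min(size, asks[price][slot])
            if matched > 0:
                asks[price][slot] -= matched
                size -= matched
                fills[price] = fills.get(price, 0) + matched
    return fills


def best_ask(asks):
    """Lowest price with visible depth remaining, or None if there is none."""
    live = [price for price, depth in asks.items() if depth[0] > 0]
    return min(live) if live else None


# Case 1: genuine iceberg order at 1.4634 (visible 2, hidden 2).
iceberg_book = {14634: [2, 2], 14635: [4, 0]}
iceberg_fills = match_buy_market_order(iceberg_book, 4)
iceberg_ask = best_ask(iceberg_book)

# Case 2: synthetic iceberg; the owner resubmits 2 lots after the match.
synthetic_book = {14634: [2, 0], 14635: [4, 0]}
synthetic_fills = match_buy_market_order(synthetic_book, 4)
synthetic_book[14634][0] += 2  # duplicate limit order repopulates 1.4634
synthetic_ask = best_ask(synthetic_book)

print(iceberg_fills, iceberg_ask)
print(synthetic_fills, synthetic_ask)
```

In the first case the whole market order is matched at 1.4634 and the ask-price moves to 1.4635; in the second, half of it is matched at 1.4635 and the ask-price returns to 1.4634 once the duplicate order arrives.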

Some exchanges have an alternative structure for iceberg
orders. Whenever a quantity equal to at least the visible
size of an iceberg order is matched to an incoming market
order, the rest of the order (i.e., the portion of the
hidden component that is not also matched to the same
incoming market order) is cancelled. In this way, market
participants with an active iceberg order at a given price
can match incoming market orders of a larger size than is
initially apparent, without revealing to the market the true
extent of their desire to trade (because only the visible
portion of the order is displayed in the LOB), but otherwise
the iceberg order behaves like any other order. This is the
system currently used by the Reuters trading platform
(Thomson-Reuters, 2011).

Some other trading platforms (e.g., Currenex and Hotspot FX
(Gould et al., 2011)) allow entirely hidden limit orders.
These orders are given priority behind both entirely visible
active orders at their price and the visible portion of
iceberg orders at their price, but they give market
participants the ability to submit limit orders without
revealing any information whatsoever to the market.

Recently, there has also been an increase in the popularity
of so-called dark pools (see, e.g., (Carrie, 2006;
Hendershott and Jones, 2005)), particularly in equities
trading. In a dark pool, no information about market
participants' trading intentions is available to other
market participants. Rules governing matchings in dark pools
vary greatly from one exchange to another (Mittal, 2008).
Some dark pools are essentially LOBs in which all limit
orders are entirely hidden, whereas other dark pools do not
allow market participants to specify prices at all. Instead,
they state only the size of their order and whether they
wish to buy or sell. Orders are held in a time priority
queue until they are matched to orders of the opposite type,
and any trades that occur do so either at the mid-price m(t)
of some other specified standard (i.e., non-dark) LOB in
which the same asset is being traded, or at a price that is
later negotiated by the two market participants involved.

Even in LOBs with no hidden liquidity, market participants
are not always able to view the entire LOB in real time. In
many exchanges, only limit orders that lie within a certain
range of relative prices are displayed. On some electronic
trading platforms, updates to L(t) are only transmitted by
the exchange with a given frequency, meaning that all
activity that has taken place since the most recent refresh
signal is invisible to market participants. As discussed in
Section III.A, market participants' actions have been found
to vary with the amount of the LOB that they are able to
view in real time (Boehmer et al., 2005; Bortoli et al.,
2006).



H. Volatility

Loosely speaking, volatility is a measure of the variability
of returns of a traded asset (Barndorff-Nielsen and
Shephard, 2010). There are many different measures of
volatility commonly calculated from financial time series,
and the exact form of volatility that is studied in a given
situation depends heavily on the type of data that is
available and the desired purpose of the calculation
(Shephard, 2005). Even when their estimation is based on the
same data, different measures of volatility sometimes
exhibit different properties. For example, using data from a
wide range of different markets, different measures of
volatility have been found to follow different intra-day
patterns; see (Cont et al., 2011) and references therein.
For this reason, many empirical studies report their results
using more than one measure of volatility.

The volatility of an asset provides some indication of 
how risky investing in it is. All other things being equal, 
an asset with higher volatility should be expected to un- 
dergo larger price changes over a given time interval than 
an asset with lower volatility. For market participants 
who wish to carefully manage their risk exposure, volatil- 
ity is a primary consideration when deciding which assets 
to invest in, and therefore forms the basis of optimal port- 
folio construction in the traditional economics literature. 

As discussed in Section IV.F, links between volatility
and several other important properties have been empir- 
ically observed in a wide range of LOBs. 






1. Model-based estimates of volatility 

A difficulty that arises when estimating any measure of
volatility is that in a LOB, many market participants submit
then immediately cancel buy (respectively, sell) limit
orders within the spread. This can occur for a variety of
reasons, but often it is the result of electronic trading
algorithms searching for hidden liquidity. This causes b(t)
(respectively, a(t)) to fluctuate without any trades
occurring, and may be considered microstructure noise rather
than a meaningful change in price. One way to address this
problem is to assume that the observed data is governed by a
model from which an estimate of volatility that is free of
microstructure noise can be derived. The parameters of the
model are then estimated from the data, and the volatility
estimate is derived explicitly from the model. However, a
drawback of this method is that the estimate of volatility
that is obtained depends heavily on the model used, and
models that poorly mimic some important aspect of the
trading process may therefore output misleading estimates of
volatility. For this reason, we restrict our attention to
model-free estimates of volatility.

2. Model-free estimates of volatility 



There is an extensive literature about using historical LOB
data to perform model-free estimates of volatility (see,
e.g., (Aït-Sahalia et al., 2005; Andersen and Todorov, 2010;
Bandi and Russell, 2006; Zhou, 1996)). In this section, we
introduce three methods for performing such estimates.



Definition. Given the bid-price time series
$b(t_1), b(t_2), \ldots, b(t_n)$ sampled at a regularly
spaced set of times $t_1, t_2, \ldots, t_n$, the bid-price
realized volatility, denoted $V_b(t_1, t_2, \ldots, t_n)$,
is the standard deviation of the set of logarithmic returns
$\{r_b(t_i, t_{i+1}) \mid i = 1, 2, \ldots, n-1\}$. The
ask-price realized volatility, denoted
$V_a(t_1, t_2, \ldots, t_n)$, and the mid-price realized
volatility, denoted $V_m(t_1, t_2, \ldots, t_n)$, are
defined similarly.

Realized volatility is a useful measure for comparing the
variability of return series that are sampled with the same
fixed, specific frequency. For example, by using the daily
closing mid-price of each of two stocks over the same year,
a comparison of the realized volatility of each would
provide insight into which stock's mid-price varied more,
day-on-day, in the given year. A similar comparison could be
made between the mid-price of the two stocks, recorded at
the start of each second during a given trading day, to
provide some insight into the relative size of the stocks'
volatility during that day. However, realized volatility
depends on the frequency at which the price series is
sampled, so it is not appropriate to compare the realized
volatility of a once-daily price series for one stock to a
once-hourly price series for another.
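
The definition of realized volatility, and its dependence on the sampling frequency, can be illustrated with a short sketch. Two assumptions are made here that the text above does not fix: the logarithmic return between consecutive observations is taken to be the logarithm of the price ratio, and the population standard deviation is used. The price series is invented for illustration.

```python
import math
import statistics


def log_returns(prices):
    """Logarithmic returns between consecutive observations."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def realized_volatility(prices):
    """Standard deviation of the log returns of a regularly sampled price series."""
    return statistics.pstdev(log_returns(prices))


# The same (hypothetical) price path sampled at two different frequencies.
fine = [100.0, 102.0, 100.0, 102.0, 100.0]
coarse = fine[::2]  # keep every second observation

vol_fine = realized_volatility(fine)
vol_coarse = realized_volatility(coarse)
print(vol_fine, vol_coarse)
```

Here the coarser sampling happens to miss every price move, so its realized volatility is zero while the finer series has a strictly positive value. This is an extreme case, but it shows why estimates obtained at different sampling frequencies should not be compared directly.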

Definition. Given the bid-price time series
$b(t_1), b(t_2), \ldots, b(t_n)$ sampled at the set of times
$t_1, t_2, \ldots, t_n$ at which n consecutive sell market
orders arrive, the bid-price realized volatility per trade,
denoted $V_b(t_1, t_2, \ldots, t_n)$, is the standard
deviation of the set of logarithmic returns
$\{r_b(t_i, t_{i+1}) \mid i = 1, 2, \ldots, n-1\}$. The
ask-price realized volatility per trade, denoted
$V_a(t_1, t_2, \ldots, t_n)$, is defined similarly, using n
consecutive buy market order arrival times for
$t_1, t_2, \ldots, t_n$. The mid-price realized volatility
per trade, denoted $V_m(t_1, t_2, \ldots, t_n)$, is defined
similarly, using n consecutive market order arrival times
(irrespective of whether they are buy or sell market orders)
for $t_1, t_2, \ldots, t_n$.

Realized volatility per trade is a useful measure for 
comparing how prices vary on a trade-by-trade basis. 

Definition. For a given trading day D, the bid-price
intra-day volatility, denoted $\rho_b(D)$, is the logarithm
of the ratio of the highest bid price during day D and the
lowest bid price during day D:

$$\rho_b(D) = \log\left(\frac{\max_{t \in D} b(t)}{\min_{t \in D} b(t)}\right).$$

The ask-price intra-day volatility over trading day D,
denoted $\rho_a(D)$, and the mid-price intra-day volatility
over trading day D, denoted $\rho_m(D)$, are defined
similarly.

Intra-day volatility is a useful measure of how likely 
very large price swings are in a given day. It is partic- 
ularly important for day traders, who buy or sell assets 
in the market but unwind their positions before the end 
of each trading day. By knowing the intra-day volatil- 
ity of an asset over a number of recent days, day traders 
can manage their exposure to the risk of a large price 
movement in a given day. 
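
As a minimal sketch, intra-day volatility can be computed directly from the prices observed during a day. The function name and the sample bid prices below are hypothetical, and the input is assumed to contain every bid price recorded during the day in question.

```python
import math


def intraday_volatility(prices):
    """Logarithm of the ratio of the highest to the lowest price during a day."""
    return math.log(max(prices) / min(prices))


# Hypothetical bid prices observed during one trading day.
bids = [1.4631, 1.4636, 1.4629, 1.4633]
print(intraday_volatility(bids))
```

Because only the daily extremes enter the calculation, the result does not depend on the order or the number of observations between them, which is what makes this measure sensitive to large price swings rather than to typical fluctuations.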

3. Time window effects 

In order to estimate how the volatility of a given asset 
varies through time, it is common to use a rolling time 
window approach to estimate a volatility series. More 
precisely, the volatility of the asset is estimated over some 
given time window, then the time window is advanced by 
some predetermined amount. This process is repeated 
multiple times, so as to span the desired time range. 
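
The rolling-window procedure described above can be sketched as follows. The function name, the `step` parameter, and the synthetic return series (zero everywhere except for a single spike) are assumptions made for illustration, and the population standard deviation is used as the volatility estimate within each window.

```python
import statistics


def rolling_volatility(returns, window, step=1):
    """Standard deviation of returns in each window of `window` points.

    The window start is advanced by `step` observations each time, and only
    windows that fit entirely inside the series are used.
    """
    return [
        statistics.pstdev(returns[start:start + window])
        for start in range(0, len(returns) - window + 1, step)
    ]


# Synthetic return series: flat apart from one spike.
returns = [0.0] * 20
returns[10] = 0.05

short = rolling_volatility(returns, window=4)
long_ = rolling_volatility(returns, window=8)
print(max(short), max(long_))
```

With this input the spike affects 4 consecutive estimates when the window length is 4 and 8 consecutive estimates when it is 8, while the largest estimate is smaller for the longer window, so the choice of window length trades off persistence against sensitivity.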

Figure 5 illustrates several points about using rolling time
windows to estimate volatility series. First, the estimate
of volatility is substantially larger whenever a spike in
the original time series falls within the estimation time
window. Using longer time windows causes such spikes to
remain within them for a longer period of time, and
therefore causes the volatility series estimate to be large
for longer. Second, longer time windows contain more data
points, so the contribution to the volatility estimate made
by any single point within the window is smaller. This means
that a given spike in the original time series results in a
smaller increase in the estimate of volatility when a longer
time window is used. Third, longer time windows lead to
smoother estimated volatility series, with or without
spikes, because more observations are averaged within each
window. For further discussion of these issues in a
practical context, see Liu et al. (1999).

FIG. 5 Simulated mid-price logarithmic return series and two
examples of corresponding mid-price realized volatility
series. Panels: simulated mid-price logarithmic return
series against time (seconds); rolling-window estimates of
mid-price realized volatility with window lengths 50 and
100, each against window start time (seconds).



I. Resolution parameters

To attract market participants, many new trading platforms
offer smaller values of $\sigma$ or $\delta p$ than those
offered by older trading platforms. Furthermore, values of
$\sigma$ and $\delta p$ vary greatly from market to market.
Expensive shares are normally traded with $\sigma = 1$
share; cheaper shares are often traded with $\sigma \gg 1$
share. In many foreign exchange markets, the currency pair
XXX/YYY is traded with $\sigma = 1$ million units of XXX; in
others, it is traded with as little as $\sigma = 0.01$ units
of XXX.^ In equity markets, $\delta p$ is often chosen to be
around 0.01% of the mid-price m(t), rounded to the nearest
power of 10. For example, Apple Inc. stock's m(t) fluctuated
in the range of $150 to $350 during the year 2010, during
which time it traded with $\delta p$ = $0.01. Common
examples of $\delta p$ in foreign exchange markets are
around 0.001% of the mid-price m(t), again rounded to the
nearest power of 10. For example, on the electronic trading
platform Hotspot FX, $\delta p$ is $0.00001 for GBP/USD
trades (where m(t) fluctuated in the range $1.40 to $1.70
during 2010) and 0.001 yen for USD/JPY trades (where m(t)
fluctuated in the range 80 yen to 100 yen during 2010).

It is a recurring theme in the empirical literature (see,
e.g., (Biais et al., 1995; Foucault et al., 2005; Seppi,
1997; Smith et al., 2003)) that a market's resolution
parameters $\sigma$ and $\delta p$ greatly affect the way in
which its market participants trade. Empirical regularities
that are present in one market may, therefore, not be
present in another, making the task of building a single
universal LOB model very difficult.

^ In foreign exchange markets, the XXX/YYY LOB is used to
match exchanges of currency XXX to currency YYY. A price in
the LOB XXX/YYY denotes how many units of currency YYY are
being exchanged for a single unit of currency XXX. For
example, a trade at the price $1.52342 in the GBP/USD market
would correspond to 1 GBP (i.e., pounds sterling) being
exchanged for 1.52342 USD (i.e., US dollars).



J. Bilateral trade agreements

On some exchanges, each market participant maintains a trade
agreement list of other market participants with whom they
are willing to trade. Under such a setup, a trade can only
occur between market participants A and B if A appears on
B's trade agreement list, and vice versa. The exchange shows
each market participant a personalized version of the LOB
that contains only the active orders owned by the market
participants with whom it is possible for them to trade.
When a market participant submits a market order, it will
only match to an active order that has been displayed to
that specific market participant, bypassing any higher
priority active orders from market participants with whom it
is not possible for them to trade.

In exchanges operating such bilateral trade agreements, it
is possible for a buy (respectively, sell) market order to
bypass all active orders at the globally lowest
(respectively, highest) price available in L(t), and match
to an active order with a strictly higher (respectively,
lower) price. Furthermore, it is possible for L(t) to
simultaneously contain both an active buy order
$x = (p_x, \omega_x)$ owned by a market participant A and an
active sell order $y = (p_y, \omega_y)$ owned by a market
participant B, with $p_y \le p_x$, without a matching
occurring, if A and B do not have a bilateral trade
agreement. Therefore, it is possible for s(t) < 0.

All of these factors make modelling of specific matchings
and of the evolution of L(t) a very difficult task in LOBs
that operate with bilateral trade agreements. Gould et al.
(2011) present a full discussion of these issues, so we do
not consider such LOBs further here.
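
Although such LOBs are not considered further, the two effects described above (personalized views of the LOB and a negative spread) are easy to reproduce in a toy setting. Everything in the sketch below is hypothetical: the participants, their agreement lists, the representation of orders as (owner, price) pairs, and the integer prices.

```python
def can_trade(agreements, a, b):
    """A and B can trade only if each appears on the other's agreement list."""
    return b in agreements.get(a, set()) and a in agreements.get(b, set())


def personalized_orders(orders, agreements, viewer):
    """Active orders, as (owner, price) pairs, that `viewer` could trade against."""
    return [
        (owner, price)
        for owner, price in orders
        if can_trade(agreements, viewer, owner)
    ]


# A and B each have an agreement with C, but not with each other.
agreements = {"A": {"C"}, "B": {"C"}, "C": {"A", "B"}}
buys = [("A", 151)]
sells = [("B", 150), ("C", 152)]

visible_to_a = personalized_orders(sells, agreements, "A")
spread = min(price for _, price in sells) - max(price for _, price in buys)
print(visible_to_a, spread)
```

Participant A sees only C's sell order at 152, so a buy market order from A would bypass B's globally better price of 150. At the same time, A's buy order at 151 and B's sell order at 150 coexist without matching, which gives a spread of -1 in the global book.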



K. Opening and closing auctions 

Many exchanges suspend standard limit order trading at the
beginning and end of the trading day, instead using an
auction system to match orders. For example, the LSE's
flagship order book SETS (SETS, 2011) has three distinct
trading phases in each trading day. Between 08:00 and 16:30,
the standard LOB mechanism is used, in a period known as
"continuous trading". Between 07:50 and 08:00, a 10-minute
"opening auction" takes place, and between 16:30 and 16:35,
a 5-minute "closing auction" takes place. Both auctions use
the same rules. During these periods, all market
participants can view and place orders as usual, but no
orders are matched. Due to the absence of matchings, the
highest price among buy orders is allowed to (and often
does) exceed the lowest price among sell orders. All orders
are stored until the opening auction period ends. At this
time, for each price p at which there is non-zero depth
available, the trade matching algorithm calculates the
number $C_p$ of trades that could occur by matching buy
orders with a price greater than or equal to p to sell
orders with a price less than or equal to p. The uncrossing
price:

$$\hat{p} = \arg\max_{p} C_p \tag{3}$$

is calculated, and all relevant matchings occur at this
price. Crucially, and in contrast to standard limit order
trading, all trades take place at the same uncrossing price
$\hat{p}$, irrespective of what price the original orders
specified. There are specific rules for choosing $\hat{p}$
if multiple values of p solve equation (3). Once the
uncrossing price $\hat{p}$ has been determined, if there is
a smaller depth available for sale than there is for
purchase (or vice versa), ties are broken using time
priority.
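
The uncrossing calculation can be sketched as below. This is an illustration under stated assumptions rather than the SETS algorithm: the quantity maximized is interpreted as the total size that could be matched at each candidate price, orders are hypothetical (price, size) pairs with integer prices, and ties between maximizing prices are resolved by simply taking the lowest such price, which stands in for the exchange-specific rules mentioned above.

```python
def uncrossing_price(buys, sells):
    """Return (price, matched_size) maximizing the size that could be matched.

    `buys` and `sells` are lists of (price, size) pairs with positive sizes.
    Candidate prices are those at which at least one order is resting.
    """
    candidates = sorted({p for p, _ in buys} | {p for p, _ in sells})

    def executable(p):
        demand = sum(size for price, size in buys if price >= p)
        supply = sum(size for price, size in sells if price <= p)
        return min(demand, supply)

    # max() keeps the first maximizer, i.e. the lowest price (arbitrary tie-break).
    best = max(candidates, key=executable)
    return best, executable(best)


buys = [(102, 5), (101, 3), (100, 4)]
sells = [(99, 2), (100, 4), (101, 3), (103, 6)]
print(uncrossing_price(buys, sells))  # (101, 8)
```

At a price of 101, buy orders totalling 8 lots and sell orders totalling 9 lots are eligible, so 8 lots can be matched; every other candidate price allows fewer. The one lot of excess supply is the kind of imbalance that time priority would resolve.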



Throughout the opening auction, all market participants can
see what the values of $\hat{p}$ and $C_{\hat{p}}$ would be
if the auction were to end at that moment. The purpose of
the opening auction is to allow all market participants to
observe the "discovery" of the price without any matchings
taking place until the process is complete. Such a price
discovery process is common to many markets.^

^ Biais et al. (1999) performed a formal hypothesis test on
price discovery data from the Paris Bourse. Working at the
2.5% level, they did not reject the null hypothesis that
market participants "learned" (i.e., that their conditional
expectations approached the market value of the asset)
during the final 9 minutes of the price discovery process.
However, they found that during the early part of the price
discovery process, market participants' actions were not
significantly different from noise.



L. Power laws

Several LOB properties have been reported to have power-law
tails:

Definition. A random variable Z is said to have a power-law
tail with exponent $\alpha$ if, in the limit
$z \to \infty$, there exists some $\alpha \in \mathbb{R}$
such that its probability density function $f_Z(z)$ decays
like $z^{-\alpha}$; i.e., $f_Z(z) \sim O(z^{-\alpha})$.

If there exists some finite $z_{\min} > 0$ such that
$f_Z(z)$ is proportional to $z^{-\alpha}$ for all
$z > z_{\min}$, then clearly Z has a power-law tail.
Although this is far from being the only possible
probability density function to describe a power-law tail,
the existence of such a $z_{\min}$ allows simple closed-form
expressions to be derived for when $Z > z_{\min}$ (Clauset
et al., 2009). In particular, if Z is a continuous random
variable for which $f_Z(z) = k z^{-\alpha}$ for all
$z > z_{\min}$, with $\alpha > 1$, then the probability
density function $f_Z(z \mid Z > z_{\min})$ describing the
distribution of Z restricted to the region $Z > z_{\min}$
can be calculated explicitly by noticing that the integral
of $f_Z(z \mid Z > z_{\min})$ over its domain must equal 1,
i.e.,

$$\int_{z_{\min}}^{\infty} f_Z(z \mid Z > z_{\min}) \, dz = \int_{z_{\min}}^{\infty} k z^{-\alpha} \, dz = 1. \tag{4}$$

Therefore:

$$k = \frac{\alpha - 1}{z_{\min}^{-\alpha + 1}}. \tag{5}$$

Yielding:

$$f_Z(z \mid Z > z_{\min}) = \frac{\alpha - 1}{z_{\min}} \left(\frac{z}{z_{\min}}\right)^{-\alpha}, \quad z > z_{\min}. \tag{6}$$

When attempting to fit power-law tails to empirical
observations, it is often assumed that
$f_Z(z \mid Z > z_{\min})$ takes this functional form, in
order to make use of such closed-form expressions in the
inference process. Under this assumption, Clauset et al.
(2009) provide a comprehensive algorithm for consistent
estimation of $\alpha$ and $z_{\min}$ directly from
empirical data via maximum likelihood techniques, and for
formally testing the hypothesis that data really does follow
a power law. Several other consistent estimation procedures
also exist (e.g., (Hill, 1975; Mu et al., 2009)), but no
single estimator has emerged as the best to adopt in all
empirical analysis, where the number of samples is
inevitably finite. For this reason, some empirical studies
report the estimates made by several different estimators,
then draw inference about $\alpha$ based on the whole set of
results. However, as Mu et al. (2009) highlighted, in many
situations different estimators produce vastly different
estimates of $\alpha$, making such inference difficult.

Despite their prominence throughout the scientific literature, doubt has recently been cast over the validity of many reported power laws (Clauset et al. 2009; Stumpf and Porter 2012). When assessing whether empirical data might follow a power law, it is crucial to perform goodness-of-fit and likelihood ratio tests, in order to provide a formal assessment of the fit and to compare it against other candidate distributions. Clauset et al. (2009) noted that the evidence supporting many reported power laws in empirical studies consisted of little more than the observation of an approximately straight line on a log-log plot of the data. Furthermore, they demonstrated that estimating $\alpha$ by performing a linear least-squares regression on such a log-log plot led to significant systematic errors, as several of the basic assumptions required to apply least-squares regression did not apply. However, many of the power laws reported in empirical studies of LOBs are not accompanied by any such goodness-of-fit or likelihood ratio tests, and many authors explicitly state that their estimation of the power-law exponent was performed via linear least-squares regression on a log-log plot. Furthermore, when working with power laws, estimators and test statistics take different functional forms for discrete data than they do for continuous data (Clauset et al. 2009). Using the estimators derived in the continuous case on discrete data leads to substantial systematic errors, yet many empirical publications do precisely this.
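To make the contrast between the two estimation routes concrete, the following is a minimal sketch, assuming continuous data and a known lower cutoff $z_{\min}$. It uses the standard continuous maximum likelihood estimator $\hat{\alpha} = 1 + n / \sum_i \ln(z_i / z_{\min})$ alongside a least-squares fit to a log-binned histogram. The function names, the binning choice, and the synthetic data are illustrative assumptions and are not taken from any of the cited studies; the full procedure of Clauset et al. (2009), including the estimation of $z_{\min}$ and the goodness-of-fit tests, is not reproduced here.

```python
import numpy as np


def sample_power_law(alpha, z_min, size, rng):
    """Draw samples with density (alpha-1)/z_min * (z/z_min)**(-alpha), z > z_min.

    Uses inverse-transform sampling of the conditional density in Equation (6).
    """
    u = rng.random(size)
    return z_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))


def alpha_mle(z, z_min):
    """Continuous-data maximum likelihood estimate of alpha for a known z_min."""
    tail = z[z >= z_min]
    return 1.0 + tail.size / np.sum(np.log(tail / z_min))


def alpha_loglog_regression(z, z_min, bins=30):
    """Least-squares slope of a log-binned histogram on log-log axes.

    Included only for comparison; the binning scheme is an arbitrary assumption.
    """
    tail = z[z >= z_min]
    edges = np.logspace(np.log10(z_min), np.log10(tail.max()), bins + 1)
    counts, _ = np.histogram(tail, bins=edges)
    density = counts / (np.diff(edges) * tail.size)
    centres = np.sqrt(edges[:-1] * edges[1:])
    keep = counts > 0  # empty bins cannot be placed on a log axis
    slope, _ = np.polyfit(np.log(centres[keep]), np.log(density[keep]), 1)
    return -slope
```

Running both functions on the same synthetic sample, and then again with a different number of bins, gives a quick feel for how sensitive the regression-based estimate is to choices that the maximum likelihood estimate does not require.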



M. Long-Range Correlations 



As discussed in Section IV.H, a wide variety of time series related to LOBs have been reported to exhibit long memory: i.e., long-range autocorrelations. If the exact correlation structure is known (or well estimated) and a long history of the series has been observed, such long-range autocorrelations provide substantial benefits when forecasting future values of a time series, as the autocorrelation structure makes future values more predictable. However, estimation of the properties of such time series is laden with technical and practical difficulties that are rarely acknowledged in the empirical literature. Beran (1994) surveyed a wide range of technical results regarding the convergence of estimators for time series with long memory; here we focus on the practical challenges of estimating long-range autocorrelations, paying careful attention to the associated difficulties.

Throughout this section, we denote by $X$ a second-order stationary time series $X = X(t_1), X(t_2), \ldots, X(t_k)$, and define autocorrelation as follows:

Definition. The lag-$l$ autocorrelation of a time series $X$ is given by

\[ A_X(l) = \frac{1}{k-l} \sum_{i=1}^{k-l} \left( X(t_i) - \langle X \rangle \right) \left( X(t_{i+l}) - \langle X \rangle \right), \]

where $\langle X \rangle = \frac{1}{k} \sum_{i=1}^{k} X(t_i)$ is the time-averaged value.



When considered as a function of $l$, $A_X$ is called the autocorrelation function. For clarity, we initially restrict our attention to processes with positive long-range autocorrelations. Time series with negative long-range autocorrelations certainly exist, but the discussion of their properties becomes more cluttered due to ± sign considerations. Negative long-range autocorrelations are reintroduced from Section III.M.3 onwards.
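As an illustration of how a lag-$l$ sample autocorrelation of this kind can be computed, the sketch below averages the product of deviations from the time-averaged value over the $k - l$ available pairs. The function names are hypothetical, and the optional division by the lag-0 value (so that the function starts at 1) is an added convenience rather than part of the definition.

```python
import numpy as np


def lag_autocorrelation(x, lag):
    """Average of (X(t_i) - <X>) * (X(t_{i+lag}) - <X>) over the k - lag pairs."""
    x = np.asarray(x, dtype=float)
    k = x.size
    if not 0 <= lag < k:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    dev = x - x.mean()  # deviations from the time-averaged value
    return np.sum(dev[: k - lag] * dev[lag:]) / (k - lag)


def autocorrelation_function(x, max_lag, normalise=True):
    """Evaluate the lag autocorrelation for lags 0, 1, ..., max_lag.

    If normalise is True, values are divided by the lag-0 value (an assumption
    made here for readability of plots, not part of the definition).
    """
    acf = np.array([lag_autocorrelation(x, lag) for lag in range(max_lag + 1)])
    return acf / acf[0] if normalise else acf
```

For example, `autocorrelation_function(x, 100)` returns an array that can be drawn on log-log axes as a first, informal look at how quickly the autocorrelations decay.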



1. Long- and short-memory processes 

Definition. The time series $X$ is said to exhibit short memory if in the limit $l \to \infty$, the autocorrelation function $A_X$ decays exponentially in $l$; i.e., $A_X(l) \sim O\left(e^{-l/\tau}\right)$, $l \to \infty$.

For such short-memory processes, $\tau$ is an indication of the number of time steps over which $X$ is appreciably autocorrelated. An example of a short-memory process is an autoregressive process (Hamilton 1994).

Definition. The time series $X$ is said to exhibit long memory if in the limit $l \to \infty$, the autocorrelation function $A_X$ decays like a power law in $l$; i.e., $A_X(l) \sim O\left(l^{-\alpha}\right)$, $l \to \infty$, where $0 < \alpha < 1$.

The exponent $\alpha$ describes the strength of the long memory: the smaller the value of $\alpha$, the longer the memory of the process.



A time series is second-order stationary if its first and second moments are finite and do not vary with time. For a discussion of issues regarding stationarity in financial time series, see Taylor (2008).






2. Practical considerations 

The above definitions present difficulties in practice. First, both definitions deal only with asymptotic behaviour, and do not specify autocorrelations at any finite $l$. Clearly, any empirically observed time series is finite, so judgement must be made as to when, or, indeed, whether, it is appropriate to judge the autocorrelation function as approaching its asymptotic behaviour. Second, the definitions deal only with rates of convergence. Autocorrelations may themselves be arbitrarily small, making their estimation very difficult. Third, for a long-memory process, values from the distant past can have a statistically significant impact on values in the present. Therefore, the estimation of parameters and their confidence intervals is a difficult task, as the long-range autocorrelation causes the effective sample size to be far smaller than it might initially seem (Farmer and Lillo 2004).



If samples are independent and identically distributed, the variance of many estimators scales with the sample size $k$ at a rate $O(k^{-1})$. However, in the presence of long memory, the variance of many such estimators scales with the sample size $k$ at a rate slower than $O(k^{-1})$, meaning that a very large sample is required in order to make good estimates. In many cases, if it is erroneously assumed that samples from a long-memory process are actually uncorrelated, the probability that a standard confidence interval of an estimate contains the true value of the parameter tends to 0 as $k \to \infty$ (Beran 1994). Therefore, modifications to such calculations must be made in order to achieve sensible confidence intervals that take the long-range autocorrelations into account.
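The slower scaling can be seen in a short worked calculation for the simplest estimator, the sample mean $\bar{X}_k$. The following is a sketch under stated assumptions: $X$ has variance $\sigma^2$, and its normalized autocorrelation $\rho(l)$ (the autocorrelation divided by its lag-0 value) is taken to be exactly $c\, l^{-\alpha}$ with $0 < \alpha < 1$ for all $l \geq 1$; the symbols $\sigma^2$, $\rho$, and $c$ are introduced here for illustration only. For any second-order stationary series,

\[ \operatorname{Var}\left(\bar{X}_k\right) = \frac{\sigma^2}{k} \left[ 1 + 2 \sum_{l=1}^{k-1} \left( 1 - \frac{l}{k} \right) \rho(l) \right]. \]

Approximating the sum by an integral gives

\[ \sum_{l=1}^{k-1} \left( 1 - \frac{l}{k} \right) c\, l^{-\alpha} \approx c \int_0^k \left( 1 - \frac{l}{k} \right) l^{-\alpha} \, dl = \frac{c\, k^{1-\alpha}}{(1-\alpha)(2-\alpha)}, \]

so that, for large $k$,

\[ \operatorname{Var}\left(\bar{X}_k\right) \approx \frac{2 c\, \sigma^2}{(1-\alpha)(2-\alpha)} \, k^{-\alpha}. \]

Because $\alpha < 1$, this decays more slowly than the $k^{-1}$ rate obtained for uncorrelated samples, which is why confidence intervals computed under an assumption of independence become too narrow as $k$ grows.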

For these reasons, along with the potential existence of noise or trends in $X$, direct estimation of $\alpha$ from the autocorrelation function $A_X(l)$ often produces very poor results (Lillo and Farmer 2004). Although log-log plots of $A_X$ can be useful as a preliminary visual tool to informally assess whether $X$ might have long memory, such plots are of little use beyond this. Instead, a variety of techniques for obtaining better estimates of the strength of long memory have been developed, as we now discuss.



3. The Hurst exponent 

Given $X$, define the new time series $Y$ as the partial sums of $X$:

\[ Y(t_i) = \sum_{j=1}^{i} X(t_j), \quad i = 1, \ldots, k. \]

In this way, $Y$ can be thought of as the random walk whose jump at time $t_i$ is given by $X(t_i)$. It can be shown that the standard deviation of $Y(t_{i+l}) - Y(t_i)$ scales asymptotically like $l^H$, for some $H$. This follows from the fact that

\[ Y(t_{i+l}) - Y(t_i) = \sum_{j=1}^{i+l} X(t_j) - \sum_{j=1}^{i} X(t_j) = \sum_{j=i+1}^{i+l} X(t_j), \tag{7} \]

then using the asymptotic properties of the sums of such $X(t_i)$. $H$ is known as the Hurst exponent of $X$, and is related to $\alpha$, discussed above, according to

\[ H = 1 - \frac{\alpha}{2}. \tag{8} \]



Therefore, $H$ also provides information about the strength of long memory in a time series. If $X$ has a Hurst exponent of $\frac{1}{2}$, then it does not display long memory. If $X$ has a Hurst exponent $\frac{1}{2} < H < 1$, then $X$ has long memory with positive long-range autocorrelations. It is a recurring mistake in the literature that if $X$ has Hurst exponent $\frac{1}{2} < H < 1$, its unconditional distribution $f(X(t_i))$ must exhibit heavy tails. However, Preis et al. (2006, 2007) showed that such an implication does not hold in general.

Time series with negative long-range autocorrelations can also be classified using their Hurst exponent. In particular, if $X$ has Hurst exponent $0 < H < \frac{1}{2}$, then $X$ has long memory with negative long-range autocorrelations.
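The scaling relation behind the Hurst exponent can be checked numerically with a deliberately naive sketch: form the partial sums $Y$, measure the standard deviation of $Y(t_{i+l}) - Y(t_i)$ for several lags $l$, and fit a straight line on log-log axes. The function name and the choice of lags are assumptions made for illustration. This is not one of the estimators recommended in the literature, and it inherits the weaknesses of log-log regression.

```python
import numpy as np


def hurst_from_increments(x, lags):
    """Naive Hurst estimate from the scaling of increments of the partial sums.

    Fits log(std of Y(t_{i+l}) - Y(t_i)) against log(l) by least squares and
    returns the slope.
    """
    x = np.asarray(x, dtype=float)
    lags = np.asarray(lags)
    y = np.cumsum(x)  # partial sums Y(t_i)
    sds = np.array([np.std(y[lag:] - y[:-lag]) for lag in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(sds), 1)
    return slope
```

Applied to uncorrelated noise, the estimate should lie close to $\frac{1}{2}$; values noticeably above or below that on real data are only a prompt for applying more careful estimators.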



4. Estimators of H 

Under some assumptions on $X$, there are several estimators of $H$ that are unbiased in the limit of infinite sample size $k$ (Taqqu et al. 1995), and that are more robust against noise in the underlying time series than is estimation of $\alpha$. However, the performance of such estimators on empirical data, which is finite in size and may not conform to these assumptions, varies considerably. Rea et al. (2009) reviewed both the asymptotic properties and finite-sample performance of several Hurst-exponent estimators. Different estimators are more common in different disciplines, although such choices tend to be for historical reasons, rather than based on performance. Some of the most commonly used estimators are:

• The R/S statistic and modified R/S statistic, which examine the scaling of the difference between the maximum and minimum departures of $Y$ from a random walk in which all jump sizes are equal to the mean jump size. However, short-range autocorrelations are known to affect the performance of the R/S statistic when applied to finite data sets, and the modified R/S statistic often fails to detect long memory when in fact it is known to be present (Lo 1989; Teverovsky et al. 1999).

Most commonly, it is assumed that $X$ is a fractional Brownian motion (i.e., a self-similar process with stationary Gaussian increments (Beran 1994; Robinson 2003)).



• Log-periodogram regression, which estimates $H$ via ordinary least squares regression in the spectral domain, rather than directly from the autocorrelation function (Geweke and Porter-Hudak 1983).



• Order-$m$ detrended fluctuation analysis (DFAm), which examines how the standard deviation of terms of the form given in Equation (7) scales with $l$, but after first removing any local polynomial trends of order less than or equal to $m$ (Kantelhardt et al. 2001; Peng et al. 1994). DFAm can therefore be applied on a time series that includes polynomial nonstationarities, and has been found to provide consistent estimates of $H$ for a very general class of time series $X$ (La Spada and Lillo 2011). Furthermore, Xu et al. (2005) found DFAm to produce estimates with smaller variance and smaller bias than those of several other estimators when applied to finite datasets. DFAm is also a useful tool for studying short-memory processes, as it is able to provide an estimate of $\tau$ for such a process (Kantelhardt et al. 2001; Peng et al. 1994).
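The detrending idea in the last of these estimators can be sketched in a few lines. The version below is a minimal illustration, assuming non-overlapping windows of equal length and a least-squares polynomial fit in each window; the function name `dfa`, the argument names, and the choice of window sizes are assumptions, and refinements used in published implementations (such as processing the series from both ends) are omitted.

```python
import numpy as np


def dfa(x, scales, order=1):
    """Minimal order-`order` detrended fluctuation analysis.

    Returns the slope of log F(s) against log s, where F(s) is the root mean
    square residual after removing a polynomial trend from each window of
    length s of the partial-sum series.
    """
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())  # partial sums of the centred series
    fluctuations = []
    for s in scales:
        n_seg = profile.size // s
        # One window per column; leftover points at the end are discarded.
        windows = profile[: n_seg * s].reshape(n_seg, s).T
        t = np.arange(s)
        coeffs = np.polyfit(t, windows, order)     # one fit per column
        trend = np.vander(t, order + 1) @ coeffs   # fitted local trends
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))
    slope, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return slope
```

A call such as `dfa(x, [16, 32, 64, 128, 256, 512], order=1)` returns a single scaling exponent; in practice it is worth plotting $\log F(s)$ against $\log s$ as well, since a fitted slope is only meaningful over the range of window sizes where the relationship is close to linear.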



Much like with the estimation of power laws discussed in Section III.L, no single estimator has emerged as the best to adopt in all situations, so it has become common to report the estimates made by several estimators and to draw inference about $H$ based on these results (Taqqu et al. 1995). However, DFAm clearly offers substantial practical benefits over the other estimators discussed here.



In order to aid comparisons between different studies, we present in Appendix A a description of all the empirical studies discussed in this survey that focus particularly on LOBs. We include information regarding the date range, source, and type of data that these studies were based on. As is clear from the table, a large number of aspects related to order placement have been studied (including the distribution of relative prices for newly arriving orders (Bouchaud et al. 2002; Gu et al. 2008b; Hollifield et al. 2004; Potters and Bouchaud 2003; Zovko and Farmer 2002); the distribution of sizes for newly arriving limit and market orders (Bouchaud et al. 2002; Challet and Stinchcombe 2001; Gopikrishnan et al. 2000; Maslov and Mills 2001; Mu et al. 2009); and the arrival rate of orders (Biais et al. 1995; Bouchaud et al. 2002; Challet and Stinchcombe 2001; Cont et al. 2010; Maskawa 2007; Mike and Farmer 2008; Ranaldo 2004)); the depth profile (Biais et al. 1995; Bouchaud et al. 2002; Gu et al. 2008c; Hollifield et al. 2004; Potters and Bouchaud 2003; Roşu 2009); order cancellation rates (Cao et al. 2008; Challet and Stinchcombe 2001; Hasbrouck and Saar 2002; Potters and Bouchaud 2003); and price changes (Cont et al. 2011; Eisler et al. 2010; Gu et al. 2008a; Plerou and Stanley 2008; Zhou 2012).



We now discuss the main findings of these empirical stud- 
ies in more detail, and examine how such findings can 
deepen understanding about certain aspects of limit or- 
der trading. We also discuss a selection of stylized facts 
that have consistently emerged from several different em- 
pirical examinations of LOB data. As we highlight in 
Section V, such stylized facts can be a valuable tool for
assessing LOB models, as a model's failure to reproduce 
the stylized facts is an indication that it poorly mimics 
some aspect of limit order trading. 



IV. EMPIRICAL OBSERVATIONS IN LIMIT ORDER 
MARKETS 

A wide range of LOB features have been studied in the 
empirical literature, often with conflicting conclusions. 
There are many possible reasons for such disagreements, 
including different markets operating differently (perhaps 
for cultural reasons or simply because trade matching algorithms
operate differently), different asset classes being
traded on different exchanges, differing levels of liquid- 
ity in different markets, and different researchers hav- 
ing access to different quality data. Furthermore, differ- 
ent authors have studied data from different years, and 
as market participants' trading strategies have evolved 
over time, so too have the statistical hallmarks they cre- 
ate. This is a particularly important consideration in 
recent years, as electronic trading algorithms have come 
to play an increasingly prominent role in markets and 
have caused an increase in both competition and trading 
volumes. 



A. Order size 

When submitting a new order, a market participant 
must select its size. Given the heterogeneous motiva- 
tions for trade that exist within a single market, it is 
unsurprising that there is a substantial variation in the 
size of incoming orders; yet a number of regularities have 
also been observed in empirical data. 

For equities traded on the Paris Bourse, the distribution of $\log(|\omega_x|)$ was reported to be approximately uniform for incoming limit orders with $10 < |\omega_x| < 50000$ (Bouchaud et al. 2002). For two stocks traded on NASDAQ, independent fits to the distribution of incoming limit order sizes $|\omega_x|$ were made for different relative prices (Maslov and Mills 2001). Power-law and log-normal distributions were reported. The mean reported power-law exponent was 1 ± 0.3 (i.e., with standard deviation 0.3). However, the quality of the power-law fits was deemed to be weak, and the log-normal fits were deemed to be applicable over a wider range of limit order sizes than the power-law fits (although the authors stated no precise range of applicability for either). For four stocks on the Island ECN, incoming limit order sizes $|\omega_x|$ were reported to cluster strongly at "round number" amounts, such as 10, 100, and 1000 (Challet and Stinchcombe 2001). A similar "round number" preference was observed for market orders on the Shenzhen Stock Exchange (Mu et al. 2009). The authors also studied the distribution of total trade sizes when aggregated over a variety of time windows, and found it to exhibit a power-law tail. Different power-law exponent estimators were found to produce different estimates of the tail exponent, but overall the authors judged the tail exponent to be larger than 2. Similar power-law fits were also reported on NASDAQ (Maslov and Mills 2001). In a study of 5 days of data covering 3 equities altogether, the mean reported power-law exponent was 1.4 ± 0.1. Although the authors did not state a range of sizes over which their reported power-law distributions applied, Figure 1 in the publication (Maslov and Mills 2001) suggests an approximate range of 200 to 5000. Power-law fits were also reported for the distribution of trade sizes in a study of the 1000 largest equities in the USA (Gopikrishnan et al. 2000). The mean reported power-law exponent was 1.53 ± 0.07. However, Bouchaud et al. (2009) noted that the data studied by Gopikrishnan et al. contained information about trades that were privately arranged to occur "off-book", and therefore were not conducted via the LOB. They conjectured that as larger trades were more likely to be arranged off-book, Gopikrishnan et al. had actually overestimated the frequency with which very large orders occurred in the LOB.

Buy (respectively, sell) market orders with a size $|\omega_x| > n(a(t),t)$ (respectively, $\omega_x > |n(b(t),t)|$), i.e., those that "walk up the book", were found to account for only 0.1% of submitted market orders on the Stockholm Stock Exchange (Hollifield et al. 2004). Therefore, the vast majority of submitted buy (respectively, sell) market orders were found to match only to limit orders at $a(t)$ (respectively, $b(t)$), not at prices deeper into the LOB.



B. Relative price

Along with its size, the other decision that a market participant must make when submitting an order is its price. As discussed in Section II.A, regularities in price series are best investigated via the use of relative pricing, as $b(t)$ and $a(t)$ themselves evolve through time. Unlike the distribution of order size, where different markets seem to exhibit different functional forms, the distribution of relative prices appears to exhibit a power-law behaviour in all studied markets. This shows that some market participants place limit orders deep into the LOB, which may suggest that they hold an optimistic belief that large price swings might occur.

Such a power law has been reported for the distribution of relative prices for orders that arrive with a non-negative relative price on the Paris Bourse (Bouchaud et al. 2002), NASDAQ (Potters and Bouchaud 2003), the LSE (Maskawa 2007; Zovko and Farmer 2002), and the Shenzhen Stock Exchange (Gu et al. 2008b); however, different values of the exponent in the power law have been observed in different markets. A value of approximately 0.6 was reported to fit the distribution of relative prices from $\delta p$ to over $100\,\delta p$ (even up to $1000\,\delta p$ for some stocks) on the Paris Bourse, for buy orders and sell orders alike. The power-law exponents, as well as the ranges of relative prices over which the distribution was reported to follow a power law, were found to vary across the stocks studied on NASDAQ (Potters and Bouchaud 2003). For both buy and sell orders, the value of the power-law exponent was reported to be approximately 1.5 for relative prices between $10\,\delta p$ and $2000\,\delta p$ on the LSE. A power law with an exponent of 1.72 ± 0.03 for buy orders and 1.15 ± 0.02 for sell orders was fitted to the distribution of non-negative relative prices from the aggregated dataset of all 23 studied stocks on the Shenzhen Stock Exchange; however, the exact matching rules on the Shenzhen Stock Exchange prevent large price changes from occurring within a single day, so some care must be taken when comparing these values to the ones from other markets. An asymmetry between buy orders and sell orders was reported for the Shenzhen Stock Exchange, but not for the other markets.

A power law was also reported for the distribution of relative prices for orders that arrive with a negative relative price on the Shenzhen Stock Exchange. The exponent was found to be 1.66 ± 0.07 for buy orders and 1.80 ± 0.07 for sell orders.

The distribution of all (i.e., both non-negative and negative) relative prices for the AstraZeneca stock on the LSE (Mike and Farmer 2008) was found to follow a Student's t distribution with 1.3 degrees of freedom. Such a distribution has power-law tails, with an exponent of 2.3. The mode of the distribution was found to occur at a relative price of 0 (i.e., at $b(t)$ or $a(t)$). The maximum arrival rate was also observed to occur at a relative price of 0 on the Shenzhen Stock Exchange (Gu et al. 2008b), the Paris Bourse (Biais et al. 1995; Bouchaud et al. 2002), NASDAQ (Challet and Stinchcombe 2001), and the LSE (Mike and Farmer 2008), although on the Tokyo Stock Exchange the maximum was found to occur inside the spread (Cont et al. 2010).
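The link between the degrees of freedom and the tail exponent quoted for the Student's t fit follows directly from the form of the density. Writing $\nu$ for the number of degrees of freedom (a symbol introduced here for illustration), the standard Student's t density satisfies

\[ f(x) \propto \left( 1 + \frac{x^2}{\nu} \right)^{-\frac{\nu+1}{2}} \sim |x|^{-(\nu+1)} \quad \text{as } |x| \to \infty, \]

so the density has power-law tails with exponent $\nu + 1$. With $\nu = 1.3$, this gives the tail exponent of 2.3 stated above.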



Notice that the notation used by Gu et al. (2008b) assigns the opposite signs when measuring relative price than those used here.






C. Limit order cancellations 

Along with the submission of orders, cancellations play a major role in the evolution of $L(t)$. A limit order that was attractive to its owner at the time of submission may, for a variety of reasons, become unattractive at a later time, and therefore require cancellation to avoid an undesired matching. In a detailed study of activity on the Australian Stock Exchange, Cao et al. (2008) found that market participants viewed cancellations (and amendments, which are reported in most data sets as a cancellation followed by a new limit order submission) as an integral part of their strategy when dealing with limit order markets. Furthermore, cancellations play a central role for many electronic trading algorithms, which often use limit order submissions that are almost immediately followed by cancellations in an attempt to detect hidden liquidity (Hendershott et al. 2011).

Approximately 25% of limit orders placed on the Island Electronic Communications Network (ECN) during the fourth quarter of 1999 were cancelled within two seconds of being placed, and 40% were cancelled within 10 seconds of being placed (Hasbrouck and Saar 2002). Moreover, the observed rate of cancellations was found to decrease monotonically away from $b(t)$ and $a(t)$ (Potters and Bouchaud 2003). A further study reported that as many as 80% of the limit orders for a selection of equities on Island ECN (namely, Cisco, Dell, Microsoft, and Worldcom) ended in cancellation rather than matching (Challet and Stinchcombe 2001). The distribution of the length of time between placement and cancellation for cancelled orders was found to exhibit large spikes at 90, 180, 300, and 600 seconds. The authors proposed that such spikes could result from an automatic mechanism for cancelling active orders after a specified time.



D. Mean relative depth profile

Despite their different resolution parameters (see Section II.A) and the different prices at which trades occur in them, a number of qualitative regularities have been empirically observed in mean relative depth profiles from a wide range of markets.

No significant difference has been detected between the mean bid-side and the mean ask-side relative depth profiles on the Paris Bourse (Biais et al. 1995; Bouchaud et al. 2002), NASDAQ (Potters and Bouchaud 2003), and for Standard and Poor's Depositary Receipts (SPY) (Potters and Bouchaud 2003). Such symmetry was not reported on the Shenzhen Stock Exchange (Gu et al. 2008c); however, this is unsurprising considering that this market has an additional rule restricting the size of price movements on any given day, which imposes an asymmetric restriction on the range of relative prices that may be selected after a price move has occurred.

Mean relative depth profiles have also been found to exhibit a "hump" shape in a wide range of markets, including the Paris Bourse (Bouchaud et al. 2002), NASDAQ (Potters and Bouchaud 2003), the Stockholm Stock Exchange (Hollifield et al. 2004), and the Shenzhen Stock Exchange (Gu et al. 2008c). The maximal mean depth available for SPY was found to occur at $b(t)$ and $a(t)$, which can also be considered as a hump with its maximum at a relative price of 0 (Potters and Bouchaud 2003).

The exact location of the hump has been found to vary between different markets, although this is unsurprising given their different resolution parameters. The effect of changing resolution parameters is twofold. First, so long as $\delta p$ is sufficiently small in a given market, further reduction in $\delta p$ would cause the hump to reside at a larger number of ticks (i.e., multiples of $\delta p$) from the best price than it did before, but only because the measurement scale had changed. Second, the resolution parameters might themselves affect the way in which market participants behave, as discussed in Section III.I.

There might, of course, also be strategic reasons that the hump occurs in different locations for different markets. For example, in markets where large price changes are relatively common, more market participants may choose to submit limit orders with larger relative prices than in those where such changes are rare, thereby increasing the relative price at which the hump resides. Roşu (2009) conjectured that a hump would exist in all markets where large market orders were sufficiently likely; this represents a trade-off between the optimism that a limit order placed away from $b(t)$ or $a(t)$ might be matched to a market order (at a significant profit for the limit order holder) and the pessimism that placing limit orders too far away from the current bid/ask might be a waste of time, as they might never be matched.



E. Volatility

For the companies that make up the FTSE 100 (Zumbach 2004), realized mid-price volatility (using 15 minute time windows) was found to be independent of the size of the market capitalization of the company, whereas mid-price volatility per trade was found to be lower for the companies with larger market capitalization.



SPY is an exchange traded fund that allows market participants 
to effectively buy and sell shares in all of the 500 largest stocks 
traded in the USA. 



More precisely, the absolute value of the mean depth available
has been found to increase monotonically over the first few rela- 
tive prices, then to decrease monotonically beyond this, thereby 
creating a "hump" in the mean relative depth profile. 






In foreign exchange markets (Zhou 1996), hourly mid-price realized volatility was found to follow different intra-day patterns for different currency pairs. The authors conjectured that this was because different currencies became more heavily traded at different times of day, due to the different time zones. For all currency pairs studied, daily mid-price realized volatility was found to be higher later in the trading week.

On the Swiss Stock Exchange (Ranaldo 2004), spread and volatility were found to follow strong intra-day patterns.



F. Conditional event frequencies 

The properties discussed so far in this section have 
all been calculated unconditionally, without reference to 
other variables. However, several factors influence how 
market participants interact with LOBs, so it is reason- 
able to believe that a deeper understanding might be 
achieved by studying not only unconditional frequencies, 
but also the frequencies of those events given that some 
other condition was satisfied. However, the study of 
such conditional event frequencies in LOBs is far from 
straightforward, for two main reasons: 

1. The state space is very large. Deciding which of the enormous number of possible events or order book states to condition on is very difficult (Parlour and Seppi 2008).



2. There is a small latency between the time that a 
market participant sends an instruction to submit 
or cancel an order and the time that the exchange 
server receives the instruction. Furthermore, as re- 
fresh signals are only transmitted by the exchange 
server at discrete time intervals, market partici- 
pants cannot be certain that the LOB they observe 
via their trading platform is a perfect representa- 
tion of the actual LOB at that instant in time. 
Therefore, conditioning on the "most recent" event 
is problematic, as the most recent event recorded 
by the exchange (and thus appearing in the mar- 
ket data) may not be the most recent event that a 
given market participant observed via the trading 
platform. 

Nevertheless, several studies of conditional event frequen- 
cies in LOBs have identified interesting behaviour in em- 
pirical data. In this section, we review the key findings 
from several such publications, highlighting both the sim- 
ilarities and differences that have emerged across different 
markets. Most such studies have used older LOB data, 
often dating back 10 years or more. While this does help 
alleviate the difficulties with latency outlined above (as 
the volume of order flows in LOBs was much lower in 
the past than it is today, so the mean inter-arrival times 



between successive events were substantially longer than 
the latency times), it also inevitably raises the question of 
how representative such findings are of limit order trad- 
ing today. 



1. Order size 

A simple example of conditional structure is the relationship between $|\omega_x|$ and the relative price $\Delta_x$ of orders, reported on the Paris Bourse by Bouchaud et al. (2002). Initially, the unconditional distribution of $|\omega_x|$ was estimated by examining the size of all orders that arrived, as discussed in Section IV.A. However, when the authors partitioned incoming orders according to $\Delta_x$, and fitted a separate distribution for each value of $\Delta_x$, substantial variation was found between the fitted distributions. In particular, those orders with larger relative price were found to have a smaller $|\omega_x|$ on average. A similar observation was made for limit orders on NASDAQ (Maslov and Mills 2001).



2. Relative price 



On the Paris Bourse (Biais et al. 1995) and the Australian Stock Exchange (Cao et al. 2008; Hall and Hautsch 2006), market participants were found to place more orders with a relative price $-s(t) < \Delta_x < 0$ (i.e., limit orders falling inside the spread) at times when $s(t)$ was larger than its median value. Similarly, on the NYSE (Ellul et al. 2003), the percentage of incoming orders that arrived with a relative price $\Delta_x > -s(t)$ (i.e., were limit orders) was found to increase as $s(t)$ increased, and was found to decrease when $s(t)$ decreased. Biais et al. (1995) argued that when $s(t)$ was small, it was less expensive for market participants to demand immediate liquidity, so more market orders were placed. However, it is also possible to explain such an observation via a zero-intelligence approach. If limit order prices are chosen uniformly at random over some fixed price interval, then it is more likely that an incoming limit order price resides in the interval $(b(t), a(t))$ when the interval is wider.
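The zero-intelligence argument can be illustrated with a small simulation. The sketch below assumes that incoming prices are drawn uniformly from a fixed interval that contains both best quotes; the function name, the interval endpoints, and the quote values used in the usage note are made-up illustrative inputs and carry no empirical meaning.

```python
import numpy as np


def fraction_inside_spread(bid, ask, low, high, n_orders, rng):
    """Fraction of uniformly placed order prices that land strictly inside (bid, ask).

    Prices are drawn uniformly on [low, high), mimicking the zero-intelligence
    assumption that price choice ignores the state of the book.
    """
    prices = rng.uniform(low, high, n_orders)
    return np.mean((prices > bid) & (prices < ask))
```

With the placement interval held fixed, for example `low=95.0` and `high=105.0`, widening the quotes from `(99.9, 100.1)` to `(99.5, 100.5)` raises the expected fraction from roughly $0.2/10$ to roughly $1/10$, without any strategic behaviour being modelled.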

The percentage of buy (respectively, sell) limit orders that arrived with a relative price $-s(t) < \Delta_x < 0$ on the Paris Bourse (Biais et al. 1995) was found to be higher at times when $|n(b(t),t)|$ (respectively, $n(a(t),t)$) was larger. This was conjectured to be evidence of market participants competing for priority, as the only way to gain higher priority than the active orders in the (already long) queue in this situation is to submit an order with a better price. Furthermore, on the NYSE (Ellul et al. 2003), the arrival rate of buy (respectively, sell) limit orders with a relative price of $-s(t) < \Delta_x < 0$ was found to increase as the total depth available on the bid (respectively, ask) side of the LOB increased, as was the rate of buy (respectively, sell) market orders. Similarly, on the Australian Stock Exchange (Hall and Hautsch 2006), the percentage of buy (respectively, sell) orders that arrived with a relative price $\Delta_x > -s(t)$ was found to decrease as the total depth available on the bid (respectively, ask) side of the LOB increased. A more recent study of the Australian Stock Exchange also reported that the proportion of arriving orders with a relative price of $\Delta_x \leq -s(t)$ (i.e., market orders) was higher when $|n(b(t),t)|$ and $n(a(t),t)$ were larger. By contrast, on the Paris Bourse $|n(b(t),t)|$ (respectively, $n(a(t),t)$) was found to have little impact on the rate of incoming sell (respectively, buy) orders with a relative price $\Delta_x \leq -s(t)$.

By contrast, the distribution of relative prices has been found to be independent of the spread on the LSE (Mike and Farmer 2008) and the Shenzhen Stock Exchange (Gu et al. 2008b). The distribution of relative prices was also found to be independent of volatility on the Shenzhen Stock Exchange.



On the LSE (Maskawa 2007), market participants were found to favour placing their limit orders at relative prices similar to those where there was already a large proportion of active orders.



3. Arrival rates 



On the Stockholm Stock Exchange (Sandas 2001), order flows at time $t$ were found to be conditional both on $L(t)$ and on previous order flows. For the Deutsche Mark/US dollar and Canadian dollar/US dollar currency pairs, traded on foreign exchange markets (Lo and Sapp 2010), order flows at time $t$ were found to be conditional on several LOB variables, including $s(t)$, $n(b(t),t)$ and $n(a(t),t)$, depth available behind the best prices, time of day, and recent order flows. However, the precise structure of such conditional dependences was found to vary between the two currency pairs.

Using several different financial instruments traded in electronic LOBs, Toke (2011) found that arrival rates of limit orders increased on both sides of the LOB following the arrival of a market order. No evidence was found that market order arrival rates increased following the arrival of a limit order.



Using data from 40 stocks on the Paris Bourse, Biais et al. (1995) studied the frequencies with which market events belonged to each of 15 different action classes (such as "arrival of buy market order", "arrival of buy limit order within the spread", and "cancellation of existing sell limit order"), conditional on the action class of the most recently recorded event. The conditional frequency with which a market event belonged to a specified action class, given that the previous market event also belonged to the same action class, was found to be higher than the unconditional frequency with which market events belonged to that action class, for all action classes. Such behaviour is known as event clustering. The authors offered numerous possible explanations for this phenomenon: market participants may have been strategically splitting large orders into smaller chunks to avoid revealing their full trading intentions (or to minimize market impact, as discussed in Section IV.G); different traders may have been mimicking each other; different traders may have been independently reacting to new information; or different traders may have been trying to undercut each other (i.e., cancelling active buy (respectively, sell) orders and resubmitting them at a slightly higher (respectively, lower) price solely to gain price priority). The authors noted that small, successive changes in $b(t)$ and $a(t)$ were observed more frequently when $s(t)$ was large, which they argued provided evidence of undercutting. However, Bouchaud et al. (2009) concluded that the phenomenon was driven primarily by strategic order splitting, and found no evidence that different traders mimicked each other's actions.
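The comparison that defines event clustering (conditional frequency of a class given that the previous event had the same class, versus the unconditional frequency of that class) can be sketched as follows. The class labels and the event stream are hypothetical and are not the 15 action classes of the original study; the function name is likewise an assumption made for illustration.

```python
from collections import Counter

def clustering_table(events):
    """Compare P(class) with P(class | previous event had the same class).

    `events` is a time-ordered list of action-class labels. Returns a dict
    mapping each label to (unconditional frequency, conditional frequency).
    """
    unconditional = Counter(events)
    prev_counts = Counter(events[:-1])  # how often each class precedes another event
    repeats = Counter(a for a, b in zip(events, events[1:]) if a == b)
    table = {}
    for label, count in unconditional.items():
        p_uncond = count / len(events)
        p_cond = repeats[label] / prev_counts[label] if prev_counts[label] else float("nan")
        table[label] = (p_uncond, p_cond)
    return table

# Hypothetical event stream in which events of the same class arrive in runs.
stream = (["buy_mo"] * 4 + ["sell_lo"] * 3 + ["buy_mo"] * 3
          + ["cancel_sell"] * 2 + ["sell_lo"] * 4)
table = clustering_table(stream)
for label, (p_uncond, p_cond) in table.items():
    print(f"{label:12s} unconditional={p_uncond:.2f} conditional={p_cond:.2f}")
```

In this toy stream every class shows a conditional frequency above its unconditional frequency, which is the signature described in the text. The sketch says nothing about which of the proposed explanations produces the clustering.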

On the Swiss Stock Exchange, order flow was found to depend on a number of factors, including volatility, recent order flow, and the state of the limit order book $L(t)$ (Ranaldo 2004). Market participants were found to submit more limit orders and fewer market orders during periods when volatility or $s(t)$ was high. The proportion of orders that arrived with negative relative price was found to decrease as the inter-arrival time between recent orders increased. Market participants were found to submit higher-priced buy orders (respectively, lower-priced sell orders) when there was a greater total depth available on the buy side (respectively, sell side) of the LOB. Buy order submission was found to depend also on the state of the opposite (i.e., sell) side of the LOB, whereas sell order submission was found to depend only on the state of the same (i.e., sell) side of the LOB. Ranaldo noted that such asymmetry may have been caused by market performance over the sample period (the percentage change in $m(t)$ was positive for all but one of the stocks studied, and exceeded 10% for 4 of the stocks studied), but may also have been caused by buyers and sellers behaving differently from each other in the market.



On the NYSE (Ellul et al. 2003), periods of time with above-average order flow rates were found to cluster together, as were periods with below-average order flow rates. The rate of limit order arrivals was also observed to be higher late in the trading day. The rate of buy (respectively, sell) limit order arrivals was found to increase after periods of positive (respectively, negative) mid-price returns. Event clustering similar to that observed by Biais et al. (1995) on the Paris Bourse was also reported. However, the number of occurrences of market events from a specific action class in a given five minute window (and hence the average rate of such occurrences over that window) and the number of occurrences of market events from the same action class in the previous five minute window were found to be negatively correlated. Furthermore, the arrival rate of market events from a given action class was found to be more heavily conditional on the action class of the single most recent market event than it was on $L(t)$, whereas the distribution of the number of occurrences of market events from a given action class in a given five minute window was found to be more heavily conditional on $L(t)$ during the previous five minute window than it was on the number of occurrences of market events from any specific action class in the previous five minute window.



On the Australian Stock Exchange (Hall and Hautsch 2006), the arrival rates of all market events were reported to increase and decrease together. The authors suggested that there might, therefore, have existed other exogenous factors (that they had not measured) that influenced LOB activity overall. In a more recent study of the Australian Stock Exchange (Cao et al. 2008), the arrival rates of market events at time $t$ were found to be conditional on $L(t)$, but not significantly conditional on the state of the LOB at earlier times. This suggests that market participants evaluated only the most recent state of the LOB, and not a longer history, when making order placement and cancellation decisions. No evidence was found that mid-price returns had a significant impact on order arrival or cancellation rates.



4. Cancellations 



On the Paris Bourse (Biais et al. 1995), cancellations on each side of the LOB were found to occur more frequently after a matching on that side of the LOB. The authors conjectured that this was evidence of market participants submitting large orders in the hope of finding hidden liquidity, then cancelling any unmatched portions of such orders.



On the Australian Stock Exchange (Cao et al. 2008), priority considerations have been observed to play a key role for market participants when deciding whether or not to cancel their existing active orders. The cancellation rate for active orders was found to increase when new, higher-priority limit orders arrived on the same side of the LOB. In addition, the cancellation rate of active buy (respectively, sell) orders at prices $p < b(t)$ (respectively, $p > a(t)$) was found to increase when $n(p - \delta p, t)$ (respectively, $n(p + \delta p, t)$) became zero. The authors suggested that this was because market participants with limit orders at price $p$ could cancel and then resubmit them at price $p - \delta p$ (respectively, $p + \delta p$), to possibly gain a better price for the trade (if the limit order eventually matched) without substantial loss of priority. No similar increase in cancellations was observed when $n(p + \delta p, t)$ (respectively, $n(p - \delta p, t)$) became zero.



5. Price movements 

On the Paris Bourse, $a(t)$ was found to be more likely to decrease (respectively, $b(t)$ was found to be more likely to increase) immediately after the arrival of a market order that had caused $b(t)$ to decrease (respectively, $a(t)$ to increase) (Biais et al. 1995). The authors suggested that such behaviour could be due to market participants reacting to some kind of "information", either via external sources of news causing a revaluation of the underlying asset or via the downward movement of $b(t)$ (respectively, upward movement of $a(t)$) itself being interpreted by other market participants as news. Indeed, Potters and Bouchaud (2003) found evidence that on NASDAQ, each new trade (by its very existence) was interpreted by market participants as a piece of new information that had a direct effect on the flow of incoming orders (and, therefore, on prices).



6. Incoming order prices 

On the LSE, the relative prices of incoming limit orders were found to be conditional on the bid-price realized volatility per trade (Zovko and Farmer 2002). More precisely, two time series were constructed by calculating the mean relative price of arriving buy limit orders and the bid-price realized volatility per trade over 10 minute windows. The cross-correlation of the two time series was calculated, and the hypothesis that they were uncorrelated was rejected at the 2.5% level. Changes in bid-price realized volatility were found to immediately precede changes in mean relative price for buy limit orders. Similar behaviour was observed when comparing the time series of ask-price realized volatility and the time series of mean relative price for sell limit orders.
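A lead-lag relationship of this kind is typically examined with a lagged cross-correlation. The following is a minimal sketch, assuming two equally spaced per-window series; the data are synthetic and constructed so that the second series follows the first by exactly one window, and the function name is hypothetical. It does not reproduce the hypothesis test used in the cited study.

```python
from statistics import mean, pstdev

def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    xs = x[: len(x) - lag]
    ys = y[lag:]
    mx, my = mean(xs), mean(ys)
    cov = mean((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Synthetic per-window series: the mean relative price responds to the
# volatility of the previous window (assumed relationship, for illustration).
volatility = [0.8, 1.1, 0.9, 1.6, 1.2, 0.7, 1.4, 1.0, 1.8, 0.9]
relative_price = [0.5] + [0.3 + 0.2 * v for v in volatility[:-1]]

for lag in range(3):
    print(lag, round(lagged_correlation(volatility, relative_price, lag), 3))
```

A peak at a positive lag indicates only that changes in the first series tend to precede changes in the second. It cannot distinguish direct causation from a common external driver.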



In foreign exchange markets (Lo and Sapp 2010), during 30 minute windows with high mid-price realized volatility, market participants were found to submit orders with higher relative prices on average.



Footnote (on the Zovko and Farmer (2002) cross-correlation result): The authors noted that it was not clear from the cross-correlation function alone whether a change in bid-price realized volatility directly caused a change in how market participants chose the relative prices for their buy limit orders shortly thereafter, or whether there was some other external factor that first affected bid-price realized volatility and then affected relative limit prices for buy limit orders. If the latter could be demonstrated, it would support the widely-held belief that many market participants consider realized volatility to be an important factor in making the decision of when to place a limit order (Zovko and Farmer 2002).






7. Order flow 



For Canadian stocks (Hollifield et al. 2006), a range of different volatility measures were found to be correlated with order flow rates. For the Deutsche Mark/US dollar and Canadian dollar/US dollar currency pairs, traded on foreign exchange markets, realized mid-price volatility was found to affect order flows (Lo and Sapp 2010). For US equities traded on Island ECN, and using a variety of different volatility measures, periods of higher volatility were found also to have a lower proportion of limit orders in the arriving order flow (Hasbrouck and Saar 2002). During such periods, submitted limit orders were found to have an increased probability of execution, and a shorter expected time until execution. Furthermore, on Euronext (Chakraborti et al. 2011b) and for German Index Futures (Kempf and Korn 1999), mid-price realized volatility was found to increase with the number of arriving market orders in a given time interval. A similar finding was reported in a study of the NYSE (Jones et al. 1994); however, a later study of the NYSE (Ellul et al. 2003) reported a positive correlation between higher mid-price realized volatility and the percentage of arriving orders that were limit orders.

On the Australian Stock Exchange, the number of arrivals and cancellations of large limit orders (i.e., those whose size was in the upper quartile of the unconditional empirical distribution of order sizes) in a given 5 minute window were found to be positively correlated with mid-price realized volatility during that window and also during the previous 5 minute window (Hall and Hautsch 2006). However, a more recent study (Cao et al. 2008) found that mid-price realized volatility per trade had only a minimal effect on order flows.



8. Limit order book state 

A positive (but weak) correlation has been observed between $s(t)$ and realized mid-price volatility in a wide range of markets (see Wyart et al. (2008) and references therein). However, a much stronger positive correlation has been observed between $s(t)$ and mid-price volatility at the trade-by-trade timescale on the Paris Bourse (Bouchaud et al. 2004), the FTSE 100 (Zumbach 2004), and the NYSE (Wyart et al. 2008). A recent study of stocks traded on the NYSE (Hendershott et al. 2011) found the once-daily time series of bid-price realized volatility to be positively correlated with the daily mean spread. Stocks with a lower mid price were found to have higher bid-price realized volatility on average. In foreign exchange markets (Lo and Sapp 2010), the variance of the depth available at any given price was found to increase during periods of high mid-price realized volatility.



On the Island ECN (Hasbrouck and Saar 2002), links relating volatility to various aspects of the depth profile were investigated, but were all found to be weak.

As discussed in Section III.A, mid-price intra-day volatility on the Sydney Futures Exchange (Bortoli et al. 2006) was found to vary according to how much information about the depth profile market participants had access to in real-time.



G. Market impact and price impact

A key consideration for a market participant who wishes to buy or sell a large quantity of an asset is how their own actions might affect the LOB (Almgren and Chriss 2001; Bouchaud et al. 2009; Cont et al. 2011; Eisler et al. 2010; Obizhaeva and Wang 2005).

For example, imagine market participant A wishes to buy $20\sigma$ shares of Company X in the LOB displayed in Figure 6. Submitting a single market order of size $\omega_x = -20\sigma$ would result in market participant A purchasing $2\sigma$ shares at the price \$1.5438, $5\sigma$ shares at the price \$1.5439, $6\sigma$ shares at the price \$1.5440, and $7\sigma$ shares at the price \$1.5441. However, if market participant A were initially to submit only a market order of size $\omega_x = -2\sigma$, it is possible that other market participants would submit new limit orders, because by purchasing the $2\sigma$ shares with highest priority in the LOB, market participant A has made it more attractive for other participants to submit new sell limit orders than it was immediately before such a purchase. Market participant A could then submit a market order to match to these newly submitted limit orders, and repeat this process until all $20\sigma$ of the desired shares are purchased. Of course, there is no guarantee that the initial market order of size $2\sigma$ would stimulate such submissions of limit orders from other market participants. Indeed, it could even cause some other market participants to cancel their existing limit orders or to submit buy market orders, pushing $a(t)$ up further and thereby ultimately causing market participant A to pay a higher price for the overall purchase of the $20\sigma$ shares. However, such order splitting has been empirically observed to be very common in a wide range of different markets (Bouchaud et al. 2009).
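The single-order case in this example amounts to matching a market order against successive ask levels. A minimal sketch of that matching is given below, assuming the ask-side depths implied by the fills quoted in the text (the depth at \$1.5441 is assumed to be at least $7\sigma$). Prices are held as integer multiples of USD 0.0001 and sizes as integer multiples of $\sigma$ to avoid rounding error; the function name and data layout are hypothetical.

```python
def walk_the_book(ask_levels, quantity):
    """Match a buy market order against ask levels sorted from best to worst price.

    `ask_levels` is a list of (price_in_ticks, depth) pairs and `quantity` is
    the order size, both in integer units. Returns the list of fills and the
    total cost in ticks.
    """
    fills, remaining = [], quantity
    for price, depth in ask_levels:
        if remaining == 0:
            break
        traded = min(depth, remaining)
        fills.append((price, traded))
        remaining -= traded
    if remaining:
        raise ValueError("not enough depth to fill the order")
    cost = sum(price * traded for price, traded in fills)
    return fills, cost

# Ask side implied by the worked example (depth at 1.5441 assumed to be at least 7).
asks = [(15438, 2), (15439, 5), (15440, 6), (15441, 7)]

fills, cost = walk_the_book(asks, 20)
print(fills)               # one fill per price level
print(cost / 20 / 10_000)  # average price paid per share, in USD

fills_small, _ = walk_the_book(asks, 2)
print(fills_small)         # only the best level is consumed
```

The sketch treats the book as static, so it captures only the cost of the single large order. Whether splitting the order lowers the overall cost depends on how other participants respond, which is not modelled here.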



The change in $b(t)$ and $a(t)$ caused by a market participant's actions is called the price impact of the actions. The necessity for market participants to monitor and control price impact predates the widespread uptake of limit order trading. In a quote-driven system, for example, any single market maker only has access to a finite inventory, so there are limits on the size that is available for trade at the quoted prices. Once a market participant reaches these limits, any further trading will have to take place with different market makers, at worse prices. Furthermore, purchasing or selling large quantities of the asset could cause market makers to adjust their quoted prices. Both of these outcomes are examples of price impact. In a limit order market, however, it is possible to consider a more general form of impact: the change in the entire LOB state $L(t)$ that is caused by a market participant's actions. Such impact is called market impact. Although, to date, the terms "price impact" and "market impact" have often been used interchangeably to refer only to changes in $b(t)$ or $a(t)$, recent work (Hautsch and Huang 2009) has shed light on how market participants' actions can affect the depths available at other prices, suggesting that it is appropriate to separate the two notions. Bouchaud et al. (2009) provide a detailed review of studies of both price and market impact.

FIG. 6 A LOB for shares in Company X (horizontal axis: price in USD).

Both price impact and market impact are difficult to quantify formally, as they each consist of two components:

• Instantaneous (or immediate) impact: the immediate effects of the specified action.

• Permanent impact: the long-term impact due to the specified action causing other market participants to behave differently in the future.

For example, the instantaneous price impact of a market buy order of size $2\sigma$ in the LOB displayed in Figure 6 is a change in $a(t)$ from \$1.5438 to \$1.5439. An example of permanent market impact of this buy market order might be another market participant deciding to submit a new sell limit order at a price of \$1.5442.

Instantaneous impact exists because the arrival or cancellation of any order affects $L(t)$ directly. Bouchaud et al. (2009) described three reasons why permanent market impact might exist. First, trades themselves might convey information to other market participants. Second, agents might successfully forecast short-term price movements and choose their actions accordingly. Third, purely random fluctuations in supply and demand might lead to permanent impact.

The various forms of instantaneous price impact and instantaneous market impact are defined as follows:

Definition. The instantaneous bid-price impact of a market event at time $t'$ is

$$\lim_{t \downarrow t'} b(t) - \lim_{t \uparrow t'} b(t).$$

Instantaneous ask-price impact and instantaneous mid-price impact are defined similarly.

Definition. The instantaneous bid-price logarithmic return impact of a market event at time $t'$ is

$$\lim_{t \downarrow t'} \log b(t) - \lim_{t \uparrow t'} \log b(t).$$

Instantaneous ask-price logarithmic return impact and instantaneous mid-price logarithmic return impact are defined similarly.

Definition. The instantaneous bid-price impact function $\phi_b(\omega_x)$ outputs the mean instantaneous bid-price impact for a buy market order of size $\omega_x < 0$. The instantaneous ask-price impact function $\phi_a(\omega_x)$ outputs the mean instantaneous ask-price impact for a sell market order of size $\omega_x > 0$. The instantaneous mid-price impact function $\phi_m(|\omega_x|)$ outputs the mean instantaneous mid-price impact for a market order of size $|\omega_x|$.

Definition. The instantaneous bid-price logarithmic return impact function $\Phi_b(\omega_x)$ outputs the mean instantaneous bid-price logarithmic return impact for a buy market order of size $\omega_x < 0$. The instantaneous ask-price logarithmic return impact function $\Phi_a(\omega_x)$ outputs the mean instantaneous ask-price logarithmic return impact for a sell market order of size $\omega_x > 0$. The instantaneous mid-price logarithmic return impact function $\Phi_m(|\omega_x|)$ outputs the mean instantaneous mid-price logarithmic return impact for a market order of size $|\omega_x|$.

Definition. The instantaneous market impact of a market event at time $t'$ is

$$\lim_{t \downarrow t'} L(t) \setminus \lim_{t \uparrow t'} L(t).$$



Footnote (on trades conveying information): Grossman and Stiglitz (1980) introduced this idea for a general market, and it has since been discussed extensively in a LOB context (see, e.g., Almgren and Chriss 2001; Bouchaud et al. 2009; Hasbrouck 1991; Potters and Bouchaud 2003).

Footnote (on agents forecasting short-term price movements): This explanation suggests that it is not market participants' actions that cause prices to rise or fall. Instead, such movements happen exogenously and market participants align their actions with them to maximize profits. Bouchaud et al. (2009) did not find evidence that this was a good reflection of reality.






It is not possible to quantify precisely the permanent price (respectively, market) impact of an action, because doing so would involve making a comparison between the values of $b(t)$ and $a(t)$ (respectively, the state $L(t)$) if the action had not happened and the values (respectively, state) if it had. If the action did happen, it is not possible to know what the values (respectively, state) would have been if it had not, and vice versa.

We now examine instantaneous and permanent price impact in more detail.



FIG. 7 An example LOB (horizontal axis: price in USD).



1. Instantaneous price impact 

To date, the study of instantaneous price impact for individual market orders has been conducted primarily via the study of instantaneous price impact and instantaneous logarithmic return impact functions. On the NYSE and American Stock Exchange (Hasbrouck 1991), the instantaneous price impact function was found to be a concave function of $|\omega_x|$. This means that the instantaneous price impact of a single market order of size $|\omega_x|$ was, on average, smaller than the sum of the instantaneous price impacts of two market orders $x_1$ and $x_2$ of sizes $|\omega_{x_1}|$ and $|\omega_{x_2}|$, such that $|\omega_{x_1}| + |\omega_{x_2}| = |\omega_x|$.
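The relationship between concavity and the impact of split orders can be made explicit. The short derivation below rests on an added assumption that is not stated in the source: that the impact function $\phi$ is concave on $[0, \infty)$ and satisfies $\phi(0) = 0$ (an order of size zero has no impact). Under that assumption, for any two sizes $a, b > 0$, concavity applied to the points $a + b$ and $0$ gives subadditivity:

```latex
\begin{align*}
\phi(a) &= \phi\!\left(\tfrac{a}{a+b}\,(a+b) + \tfrac{b}{a+b}\cdot 0\right)
         \ge \tfrac{a}{a+b}\,\phi(a+b) + \tfrac{b}{a+b}\,\phi(0)
         = \tfrac{a}{a+b}\,\phi(a+b), \\
\phi(b) &\ge \tfrac{b}{a+b}\,\phi(a+b), \\
\phi(a) + \phi(b) &\ge \phi(a+b).
\end{align*}
```

As an illustration with a hypothetical square-root impact function $\phi(\omega) = \omega^{1/2}$: $\phi(4) = 2$, whereas $\phi(2) + \phi(2) = 2\sqrt{2} \approx 2.83$. The comparison concerns mean impacts as a function of size only. It does not account for how the LOB changes between the two smaller orders.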



Lillo et al. (2003) studied the stocks of 1000 different companies traded on the NYSE, and sorted them into 20 groups according to their market capitalization (i.e., the total value of all of a given company's shares). Within each group, they then merged their market data and fitted a single curve to $\Phi_m(|\omega_x|)$. They found that $\Phi_m$ could be fitted with a power law $\Phi_m(|\omega_x|) \propto |\omega_x|^{\alpha}$ for all 20 groups. The value of $\alpha$ was found to vary between the different groups, taking values between approximately 0.2 and 0.5, although no goodness-of-fit statistics were presented with the results and it is not clear how well the fits performed for individual stocks, rather than aggregated groups.
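For readers who wish to experiment with fits of this kind, a power-law exponent can be estimated by ordinary least squares on log-transformed data. The sketch below is a generic log-log regression on synthetic data and is not the fitting procedure of the cited study; the function name, the prefactor 0.01, and the exponent 0.3 are assumptions chosen for illustration.

```python
import math

def fit_power_law(sizes, impacts):
    """Least-squares fit of log(impact) = log(A) + alpha * log(size).

    Returns (A, alpha). Both inputs must contain strictly positive values.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in impacts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    return math.exp(my - alpha * mx), alpha

# Synthetic, noise-free data generated from impact = 0.01 * size**0.3.
sizes = [100, 200, 500, 1000, 5000, 10000]
impacts = [0.01 * s ** 0.3 for s in sizes]

A, alpha = fit_power_law(sizes, impacts)
print(round(A, 4), round(alpha, 4))
```

On real data, a fit of this type should be accompanied by a goodness-of-fit assessment, which is the gap noted in the text.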

After performing a change of variables that rescales the data by powers of $C$, where $C$ is the average market capitalization for stocks in the group and the exponents $\delta$ and $\gamma$ are fitted constants, they found that the rescaled curves $\Phi_m'$ for each of the 20 groups collapsed onto a single "universal" curve.

A similar collapse of $\Phi_m$ onto a single (power-law) curve was observed for 11 stocks traded on the LSE (Farmer et al. 2005), under a change of variables involving the average arrival rate of market orders, the average arrival rate of limit orders, and the average cancellation rate of active orders per unit size.



Using data from the Shenzhen Stock Exchange, Zhou (2012) partitioned incoming orders according to whether or not they received an immediate full matching. The reason for doing so is as follows. Consider the LOB displayed in Figure 7. If a buy order $x = (\$1.54, \omega_x)$ arrives, one of the following three possible cases will apply:

1. If $\omega_x = \sigma$, then $n(\$1.54, t) = 2\sigma - \sigma = \sigma$ after its arrival. Therefore, $b(t)$ and $a(t)$ remain unchanged, so $m(t)$ remains unchanged.

2. If $\omega_x = 2\sigma$, then $n(\$1.54, t) = 2\sigma - 2\sigma = 0$ after its arrival. Therefore, $b(t)$ remains unchanged and $a(t)$ increases to \$1.55, so $m(t)$ increases by \$0.005.

3. If $\omega_x > 2\sigma$, then $n(\$1.54, t) = 2\sigma - \omega_x < 0$ after its arrival. Therefore, $b(t)$ increases to \$1.54 and $a(t)$ increases to \$1.55, so $m(t)$ increases by \$0.025.
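The three cases can be reproduced with a toy matching routine. The book used below is an assumed reading of the worked example (best bid at \$1.50, two lots on offer at \$1.54, and further sell depth at \$1.55); prices are held in integer cents and sizes in lots, and the function name is hypothetical.

```python
from fractions import Fraction

def mid_change_after_buy(bid, asks, limit_price, size):
    """Mid-price change (in cents) caused by a buy limit order.

    `bid` is the best bid in cents, `asks` maps ask prices in cents to depth
    in lots, and `size` is a positive number of lots. Any unmatched part of
    the order rests in the book at `limit_price`.
    """
    asks = dict(asks)  # work on a copy so the caller's book is not modified
    mid_before = Fraction(bid + min(asks), 2)
    remaining = size
    for price in sorted(asks):
        if price > limit_price or remaining == 0:
            break
        traded = min(asks[price], remaining)
        asks[price] -= traded
        remaining -= traded
        if asks[price] == 0:
            del asks[price]
    if remaining:  # the unmatched part becomes the new best bid
        bid = max(bid, limit_price)
    return Fraction(bid + min(asks), 2) - mid_before

book_asks = {154: 2, 155: 3}  # assumed depths, in lots
changes = [mid_change_after_buy(150, book_asks, 154, size) for size in (1, 2, 3)]
print([float(c) for c in changes])  # [0.0, 0.5, 2.5] cents
```

The jump from 0.5 cents to 2.5 cents between the second and third cases is what motivates treating fully matched and partially matched orders separately.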

In summary, incoming orders that are fully matched upon arrival always have a strictly smaller instantaneous mid-price impact than those orders that are not. This was reinforced by the data from the Shenzhen Stock Exchange, where incoming orders that were only partially matched upon arrival were found to have much larger instantaneous mid-price impact than incoming orders that were fully matched upon arrival. Furthermore, $\Phi_m(|\omega_x|)$ was found to take a different functional form in the two cases:

• For incoming orders that were only partially matched upon arrival, $\Phi_m(|\omega_x|)$ was found to be constant for all $|\omega_x| < 10000$ shares, then to increase for larger values of $|\omega_x|$.

• For incoming orders that were fully matched upon arrival, $\Phi_m(|\omega_x|)$ was reported to follow the power law $\Phi_m(|\omega_x|) = A|\omega_x|^{\alpha}$, where $A$ is a constant that varied from stock to stock. Among buy orders, the mean value of $\alpha$ was $0.66 \pm 0.05$. Among sell orders, the mean value of $\alpha$ was $0.69 \pm 0.06$.






After applying a change of variables involving mean values taken across all incoming market orders in the data (denoted by angle brackets $\langle \cdot \rangle$), Zhou found that the rescaled $\Phi_m$ curves for all studied stocks collapsed onto a single curve for incoming orders that were fully matched upon arrival, and onto another single curve for incoming orders that were only partially matched upon arrival. In particular, the asymmetry between the bid side and the ask side of the LOB was no longer present after the rescaling.



On both the Paris Bourse and NASDAQ (Potters and Bouchaud 2003), a logarithmic functional form was found to provide a better fit to $\phi_m$ than a power-law relationship. Furthermore, power-law relationships were found to overestimate the mean instantaneous mid-price impact of very large market orders on both the LSE and the NYSE (Farmer and Lillo 2004).



2. Permanent price impact 

As discussed above, it is impossible to quantify exactly the permanent price impact of a market event. However, to gain some insight into the longer-term effects of market events, several empirical studies have compared changes in $b(t)$ and $a(t)$ over specified time intervals with measures of trade imbalance.

Definition. The trade imbalance count during time interval $T = [t_1, t_2]$, denoted $\Omega_c(T)$, is the difference between the total number of incoming buy market orders and the total number of incoming sell market orders that arrive during time interval $T$.

Definition. The trade imbalance size during time interval $T = [t_1, t_2]$, denoted $\Omega_\omega(T)$, is the difference between the total absolute size of all incoming buy market orders and the total size of all incoming sell market orders that arrive during time interval $T$.
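Both imbalance measures are straightforward to compute from a list of market orders. The sketch below assumes each order is stored as a (time, signed size) pair using the sign convention of the text, in which buy market orders have negative size and sell market orders have positive size. The function name and the sample orders are hypothetical.

```python
def trade_imbalance(market_orders, t1, t2):
    """Return (count imbalance, size imbalance) for market orders with t1 <= time <= t2.

    Each order is a (time, size) pair; buy market orders have size < 0 and
    sell market orders have size > 0.
    """
    window = [size for time, size in market_orders if t1 <= time <= t2]
    buys = [size for size in window if size < 0]
    sells = [size for size in window if size > 0]
    count_imbalance = len(buys) - len(sells)
    size_imbalance = sum(abs(size) for size in buys) - sum(sells)
    return count_imbalance, size_imbalance

# Hypothetical market orders as (arrival time in seconds, signed size in lots).
orders = [(1, -3), (4, 2), (7, -1), (9, -5), (12, 4), (20, -2)]
print(trade_imbalance(orders, 0, 15))  # (1, 3)
```

In the window shown, three buys and two sells give a count imbalance of 1, and total sizes of 9 and 6 lots give a size imbalance of 3.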

Using data for Deutsche Mark/US dollar and US dollar/Yen foreign exchange markets, Evans and Lyons (2002) performed an ordinary least squares regression comparing the ask-price logarithmic return between successive trading days and the daily trade imbalance count. A statistically significant, positive, linear relationship was found between the two variables.
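A regression of returns on trade imbalance can be sketched in a few lines. The example below implements simple ordinary least squares with one regressor and reports the slope, intercept, and $R^2$. The data are invented for illustration and are not taken from any of the cited studies, the function name is hypothetical, and no significance test is included.

```python
def ols_fit(x, y):
    """Simple linear regression y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical daily data: trade imbalance counts and logarithmic returns.
imbalance = [-40, -25, -10, 0, 5, 15, 30, 45]
log_return = [-0.0042, -0.0021, -0.0013, 0.0002, 0.0001, 0.0019, 0.0028, 0.0047]

intercept, slope, r2 = ols_fit(imbalance, log_return)
print(f"slope={slope:.6f} intercept={intercept:.6f} R^2={r2:.3f}")
```

A positive slope corresponds to the positive linear relationship described in the text. The $R^2$ value is the usual measure of the strength of such a fit.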

For German Stock Index futures, the average mid-price logarithmic return over a 5 minute window was found to be a concave function of the trade imbalance count during that window (Kempf and Korn 1999). For the largest 100 stocks on the NYSE (Gabaix et al. 2006), the average mid-price logarithmic return was found to follow the relationship $\Omega_\omega(T)^{0.5}$ for time intervals of length 15 minutes. For the 116 most liquid stocks in the US between 1994-1995 (Plerou et al. 2002), for a variety of different time interval lengths, the average change in mid-price over the interval was found to be a concave function of $\Omega_\omega(T)$. Furthermore, for small values of $\Omega_\omega(T)$, the average change in mid-price over the interval was found to be well approximated by $\Omega_\omega(T)^{\alpha}$, with the value of $\alpha$ depending on the length of the time interval, and ranging from $\alpha = 1/3$ for intervals of length 5 minutes to $\alpha = 1$ for intervals of length 195 minutes. Similarly, for the AstraZeneca stock (traded on the LSE) (Bouchaud et al. 2009), the average mid-price logarithmic return was found to become an increasingly linear function of the length of time interval $T$ as the length of $T$ was increased.

Cont et al. (2011) recently proposed that in limit order markets, price impact should be studied as a function of the difference between limit order flow on the bid side and the ask side of the LOB, rather than of $\Omega_\omega(T)$, thus acknowledging that cancellations can also have price impact. Using data from the NYSE, the authors performed an ordinary least squares regression of the mean change in mid-price over a time window of length 10 seconds onto the limit order flow imbalance over the same time window, for each stock separately. For 43 out of the 50 stocks studied, the gradient coefficient of the regression was found to be significantly different from 0 at the 95% level. The value of the gradient coefficient was found to be larger on average for those stocks with smaller mean values of $|n(b(t),t)|$ and $n(a(t),t)$. The authors noted that their ordinary least squares regressions provided a "surprisingly high" strength of fit across all stocks, despite the nuances of how the individual stocks were traded. In particular, the mean of the $R^2$ coefficients for the individual stocks' regressions was found to be 65%. When the regressions were repeated, but using $\Omega_\omega(T)$ rather than limit order flow imbalance as the independent variable, the average of the $R^2$ coefficients was only 32%. Furthermore, the authors conjectured that any observable relationship between price impact and $\Omega_\omega(T)$ was actually a by-product of the fact that $\Omega_\omega(T)$ was correlated with limit order flow imbalance, and that limit order flow imbalance was the real driving force.



3. Market impact

In contrast to the wealth of empirical studies on price impact, little has been published to date on market impact. However, by considering the impact of market events on the depths available not only at $b(t)$ and $a(t)$ but also at nearby prices, Hautsch and Huang (2009) were able to track how order arrivals affected the state of the LOB $L(t)$ of 30 stocks traded on Euronext. In particular, limit orders placed at a price $p \le b(t)$ (respectively, $p \ge a(t)$) were found to cause a significant permanent increase in $b(t)$ (respectively, decrease in $a(t)$) at the 95% level. Limit orders with a relative price of $\delta p$ or $2\delta p$ were found to affect $b(t)$ and $a(t)$ only 20% less, on average, than limit orders placed at $b(t)$ and $a(t)$ (i.e., with a relative price of 0). Limit orders placed with negative relative price were found to have significant market impact. The market impact of a market order was found to be, on average, four times greater than that of a limit order of the same size.

The effect on b(t) and a(t) caused by limit orders arriving
with a relative price of zero was observed to occur
more quickly than for limit orders arriving with positive
relative price. The market impact of limit orders placed
inside the spread was found to be largely instantaneous,
with little permanent impact, whereas limit orders that
arrived with zero or positive relative price were observed
to have no immediate impact but significant permanent
impact.

Similar results were reported for all stocks studied, but
asymmetries were found between the bid side and the
ask side of the LOB (much like Kempf and Korn (1999)
found for price impact). The authors conjectured that
the impact they observed was due partly to arriving or- 
ders triggering an instantaneous imbalance in supply and 
demand, and partly to other market participants inter- 
preting order arrivals as containing information, thereby 
causing them to adjust their own future actions and lead- 
ing to permanent market impact. The results suggested 
that the arrivals of market orders were interpreted by 
market participants as being a particularly strong in- 
formation signal. Such an observation provides a pos- 
sible explanation as to why so many market participants 
choose to place iceberg orders: placing an iceberg order 
is an effective way to hide the true size of limit orders 
from the market, and thus to minimize market impact. 



Note that, by definition, a buy (respectively, sell) limit order
placed inside the spread immediately affects b(t) (respectively,
a(t)).



H. Stylized facts

Several non-trivial statistical regularities have been detected
in empirical data from a wide range of different
markets. Such regularities have come to be known as
the stylized facts of markets, and they may provide interesting
insights into the behaviour of market participants
(Cont 2001) and the structure of markets themselves
(Bouchaud et al. 2009). The stylized facts are also
useful from a modelling perspective, because a model's
inability to reproduce one or more stylized facts can be
used as an indicator for how it needs to be improved, or
used as justification for ruling it out altogether. For example,
the existence of volatility clustering eliminates the
simple random walk as a model for the evolution of m(t)
through time, as the existence of volatility clustering in
real mid-price time series implies that large price variations
are more likely to follow large price variations than
they are to occur unconditionally (Lo and MacKinlay
2001).

Reproduction of the stylized facts has proven to
be a serious challenge for LOB models (Chakraborti
et al. 2011b), particularly for those based on zero-intelligence
assumptions, which have, thus far, produced
more volatile price series than empirical observations
have suggested is appropriate (Chakraborti et al. 2011a).
This might imply that the strategic behaviour of real
market participants somehow stabilizes prices, and is
therefore an important ingredient in real LOB trading.



Cont (2001) reviewed a wide range of stylized facts
and their estimation from empirical data; here we discuss
a small subset that we consider to be the most relevant
from a LOB perspective. The stylized facts presented
here are of particular theoretical interest because
they all suggest that non-equilibrium behaviour plays an
important role in LOBs. A result from statistical mechanics
is that systems that are in equilibrium give rise
to distributions from the exponential family (Mike and
Farmer 2008), whereas the distributions describing several
aspects of LOB behaviour exhibit power-law tails,
highlighting the possibility that LOBs may always be in
a transient state.



1. Heavy-tailed return distribution 

Over all timescales ranging from seconds to days, the
unconditional distribution of mid-price returns displays
tails that are heavier than those of a normal distribution (i.e.,
they have positive excess kurtosis). The exact form of the
distribution has been found to vary with the timescale
used. Across a wide range of different markets (e.g.,
(Gopikrishnan et al. 1998; Gu et al. 2008a)), at the
shortest timescales the tails of the distribution appear
to be well-approximated by a power law with exponent
α ≈ 3, thus earning the name "the cubic law of returns"
in the literature. Stanley et al. (2008) went on to conjecture
that such a universal power-law tail might be a
consequence of power-law tails in both the distribution of
market order sizes and the instantaneous mid-price logarithmic
return impact function. However, Mu and Zhou
(2010) reported that this relationship did not hold in
emerging markets, and Drożdż et al. (2007) noted that
the tails are actually thinner, i.e., α > 3, in the most
recent data, highlighting that the quantitative form of
the stylized facts may itself change over time as trading
styles evolve. At longer timescales, the distribution
becomes increasingly better approximated by a normal
distribution (a behaviour often referred to as aggregational
Gaussianity) (Cont 2001; Gopikrishnan et al.
1999; Zhao 2010).

The return distribution has been found to exhibit
heavy tails on Euronext (Chakraborti et al. 2011b), the
Paris Bourse (Plerou and Stanley 2008), the S&P 500 index
(Cont 2001; Gallant et al. 1992; Gopikrishnan et al.
1999), foreign exchange markets (Guillaume et al. 1997),
the NYSE (Gopikrishnan et al. 1998), the American
Stock Exchange (Gopikrishnan et al. 1998; Plerou and
Stanley 2008), NASDAQ (Gopikrishnan et al. 1998), the
LSE (Plerou and Stanley 2008), and the Shenzhen Stock
Exchange (Gu et al. 2008a). The heavy-tailed nature of
returns has particularly important consequences for market
participants, as it highlights that large movements in
price are more likely than they would be if returns were
normally distributed, and is central to the management
of risk when calculating investment strategies.



2. Autocorrelation of returns

Except on very short timescales, when it exhibits weak
negative autocorrelation, the time series of mid-price returns
does not display any significant autocorrelation
(Chakraborti et al. 2011b; Cont 2005; Stanley et al.
2008). As highlighted by Cont (2001), the absence of autocorrelation
in returns can be explained using perfect-rationality
arguments. If returns were indeed autocorrelated,
rational market participants would employ simple
strategies that used this fact to generate positive expected
earnings. Such actions would themselves reduce
the level of autocorrelation, so autocorrelation would not
persist.

The absence of autocorrelation on all but the shortest
timescales is a well-established empirical fact that has
been observed in a very large number of markets, including
the NYSE (Aït-Sahalia et al. 2005; Cont 2005),
Euronext (Chakraborti et al. 2011b), the US dollar/Yen
and Pounds Sterling/US dollar currency pairs traded on
foreign exchange markets (Bouchaud and Potters 2003;
Cont et al. 1997), the S&P 500 (Bouchaud and Potters
2003; Gopikrishnan et al. 1999), German interest
rates futures contracts (Bouchaud and Potters 2003),
and crude oil futures (Zhao 2010).

The timescales over which the negative autocorrelation
persists are less clear, however. Using data from
1984-1996, the S&P 500 exhibited negative autocorrelation
in mid-price returns on timescales of up to around
20 minutes (Gopikrishnan et al. 1999). However, using
data from 1991-2001, the S&P 500 showed negative
correlation up to 10 minutes, but not 15 minutes.
Over the same date range, German interest rates futures
contracts showed similar behaviour, whereas the Pounds
Sterling/US dollar currency pair exhibited negative autocorrelation
up to timescales of 20 minutes, but not 25
minutes (Bouchaud and Potters 2003). On the NYSE, a
publication from 2005 reported that negative autocorrelation
persisted at 5 minutes, but not 10 minutes (Cont
2005) (although no exact date of when the data itself
was collected was given). However, data from Euronext
during 2007-2008 exhibited no significant autocorrelation
over time windows of 1 minute (Chakraborti et al.
2011b). Furthermore, in NYSE data from 2010, no significant
autocorrelation was present over any timescales
of 20 seconds or longer (Cont et al. 2011), and for crude
oil futures contracts traded in 2005 (Zhao 2010), negative
autocorrelation was found to persist for only 10-15
seconds. In summary, it appears that any negative autocorrelation
disappears more quickly in more recent market
data than it does in older data, again indicating that
the exact quantitative details of stylized facts may have
changed over time.



3. Long memory

Several empirical studies have reported LOB time series
to exhibit long memory (as defined in Section III.M).

Time series of absolute or square mid-price returns
display positive long memory over timescales as long
as weeks or even months (Cont 2001; Liu et al. 1997;
Stanley et al. 2008). More precisely, the square mid-price
returns for S&P 500 index futures (Cont 2001),
the NYSE (Cont 2005), the US dollar/Yen currency pair
(Cont et al. 1997) and crude oil futures (Zhao 2010)
were all found to exhibit slowly-decaying positive autocorrelations
at intra-day timescales, as were the absolute
mid-price returns on the Paris Bourse (Chakraborti et al.
2011b) and the Shenzhen Stock Exchange (Gu and Zhou
2009a). The Hurst exponent H of the volatility series was
reported to be H ≈ 0.8 on the Paris Bourse, H ≈ 0.815
for the US dollar/Yen currency pair, and H ≈ 0.58 on
the Shenzhen Stock Exchange.

The time series constructed by assigning the value +1
to incoming buy orders and −1 to incoming sell orders
has been found to exhibit a long memory on the Paris
Bourse (Bouchaud et al. 2004), the NYSE (Lillo and
Farmer 2004), and the Shenzhen Stock Exchange (Gu
and Zhou 2009a). On the LSE (Bouchaud et al. 2009;
Lillo and Farmer 2004; Mike and Farmer 2008), the time
series constructed by assigning the value +1 to incoming
buy market orders and −1 to incoming sell market orders
has been found to exhibit a long memory, as has the same
series for buy or sell limit orders and for buy or sell active
order cancellations. Statistically significant differences
were reported between the values of the exponents for different
stocks. Volatility, n(b(t),t), and n(a(t),t) were all
also found to exhibit long memory. The arrival of external
news and the strategic splitting of orders (see Section
IV.G) were both offered as potential causes of such long
memory. Zovko and Farmer (2002) also reported that on
the LSE, the time series of relative prices of submitted
limit orders was found to be a long-memory process, with
Hurst exponent H = 0.8. A similar long-memory effect
was reported on the Shenzhen Stock Exchange (Gu and
Zhou 2009a), with Hurst exponent H = 0.62.
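The slowly-decaying autocorrelation of order signs can be illustrated with a short simulation. The sketch below is a toy mechanism, not a model taken from any of the cited studies: it assumes that "parent" orders with random signs arrive one at a time and are each split into a Pareto-distributed number of child orders. The function names, the tail exponent of 1.5, and the series length are hypothetical choices. A shuffled copy of the series, which has the same numbers of +1 and −1 entries but no temporal structure, serves as a benchmark.

```python
import random


def sample_autocorrelation(series, lag):
    """Sample autocorrelation of a numeric sequence at a positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((s - mean) ** 2 for s in series) / n
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var


def split_order_signs(n_orders, tail_exponent, rng):
    """Toy sign series: each parent order has a random sign (+1 buy, -1 sell)
    and is split into a heavy-tailed number of consecutive child orders."""
    signs = []
    while len(signs) < n_orders:
        sign = rng.choice((1, -1))
        n_children = int(rng.paretovariate(tail_exponent))  # always >= 1
        signs.extend([sign] * n_children)
    return signs[:n_orders]


rng = random.Random(1)
signs = split_order_signs(50_000, tail_exponent=1.5, rng=rng)
shuffled = signs[:]
rng.shuffle(shuffled)  # destroys temporal ordering, keeps the sign counts

for lag in (1, 10, 100):
    print(lag,
          round(sample_autocorrelation(signs, lag), 3),
          round(sample_autocorrelation(shuffled, lag), 3))
```

In this toy setting, the autocorrelation of the split-order series remains clearly positive at lags where the shuffled series is indistinguishable from noise, which is the qualitative signature described above.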



As Lillo and Farmer (2004) discussed, it might be expected
that the various forms of long memory observed
would lead to high levels of predictability in mid-price
returns. However, anti-correlations were found to exist
between the different long memory processes, thereby removing
exploitable predictability in the mid-price return
series. In particular, the long memory of market order
arrivals was found to be offset by the long memory in
n(b(t),t) and n(a(t),t), meaning that when predictability
of market order arrivals was high, the probability
that a buy (respectively, sell) market order would cause a
change in m(t) was low, because |n(b(t),t)| and n(a(t),t)
were larger. Bouchaud et al. (2004) offered an alternative
explanation for the absence of such predictability,
instead suggesting that the long memory in the arrival of
limit orders offset the long memory in the arrival of market
orders, thereby again making large changes in m(t)
unlikely.



V. LIMIT ORDER BOOK MODELS 

During recent years, the economics and physics communities
have both made substantial progress with LOB
modelling, as surveyed in (Chakraborti et al. 2011a; Parlour
and Seppi 2008). However, work by the two communities
has remained largely independent (Farmer et al.
2005). Work by economists has tended to be "trader-centric",
using perfect-rationality frameworks to derive
optimal trading strategies given certain market conditions.
These models have generally treated order flow as
static. By contrast, models from physics have tended to
be conceptual toy models of the evolution of L(t). By
relating changes in order flow to properties of the LOB,
these models treat order flow as dynamic (Farmer et al.
2005). Both approaches clearly have their strengths. An
understanding of trading strategies is crucial for market
participants and regulators of markets alike (Alfonsi
et al. 2010; Almgren and Chriss 2001; Cao et al. 2008;
Evans and Lyons 2002; Foucault et al. 2005; Gatheral
2010; Goettler et al. 2006; Hall and Hautsch 2006; Hollifield
et al. 2006; Roşu 2010; Sandås 2001; Seppi 1997;
Wyart et al. 2008). An understanding of the state of the
LOB and order flow helps to explain many of the regularities
found in empirical data, and gives some insight
into whether such regularities might emerge as a consequence
of market microstructure, rather than the strategic
behaviour of those trading within it (Bouchaud et al.
2009; Farmer et al. 2005; Gu and Zhou 2009a; Mike and
Farmer 2008; Smith et al. 2003).



In this section, we assess existing LOB models in terms
of their ability to accurately mimic the structure of limit
order trading and reproduce empirical facts from LOB
data (as discussed in Section IV), and we highlight the
main challenges and problems that are yet to be addressed.



A. Perfect-rationality approaches 

In the traditional economics approach, rational in- 
vestors faced with straightforward buy or sell possibil- 
ities choose portfolio strategies of holdings to maximize 
personal utility, subject to budget constraints. However, 
limit order markets provide a substantially more compli- 
cated scenario. Rather than submitting orders for ex- 
act quantities at exact prices, an investor calculating the 
ideal portfolio may attempt to construct this using both 
limit orders and market orders. Investors can be cer- 
tain about the state of their portfolio if they only submit 
market orders, but the inherent uncertainty of execution 
of limit orders adds substantially to the uncertainty of 
the evolving state space for all investors who choose to 
use them. Furthermore, for a market participant to successfully
calculate fill probability distributions for limit
orders, it is necessary to condition on L(t). If a market
participant also believes that there are regularities
present in order flow, these must also be conditioned on. 
However, in heavily-populated markets it is unlikely that 
any such regularities will persist, as they could provide 
statistical arbitrage opportunities if they did so. 



1. Cut-off strategies 

Many early perfect-rationality models aimed to ad- 
dress market participants' decision making via the use 
of a cut-off strategy, analogous to a hypothesis test in 
statistical inference: 

Definition. When attempting to choose between decision
D_1 and decision D_2 at time t, an individual employing a
cut-off strategy compares the value of a statistic Z(t) ∈ ℝ
with a cut-off point z ∈ ℝ, and makes the decision

D_1, if Z(t) < z,
D_2, otherwise.

The statistic Z(t) can be any statistic related to L(t),
current or recent order flow, the actions of other market
participants, and so on. For example, a market participant
who wishes to place a buy order at time t might
decide to submit a buy market order if s(t) is smaller
than 5δp, or to submit a buy limit order with a relative
price of Δ = −δp otherwise.
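The definition and the spread-based example above translate directly into code. The following is a minimal sketch: the function names, the tuple representation of an order, and the use of `None` for the (irrelevant) relative price of a market order are illustrative assumptions.

```python
def cutoff_decision(statistic, cutoff, decision_1, decision_2):
    """Return decision_1 if statistic < cutoff, and decision_2 otherwise."""
    return decision_1 if statistic < cutoff else decision_2


def choose_buy_order(spread, tick_size, cutoff_ticks=5):
    """Spread-based example: submit a buy market order if the spread is
    smaller than cutoff_ticks ticks, and otherwise a buy limit order with a
    relative price of minus one tick.

    Returns a hypothetical (order_type, relative_price) pair.
    """
    market_order = ("market", None)
    limit_order = ("limit", -tick_size)
    return cutoff_decision(spread, cutoff_ticks * tick_size,
                           market_order, limit_order)


# Spreads are measured in the same units as tick_size.
for spread in (2, 5, 9):
    print(spread, choose_buy_order(spread, tick_size=1))
```

Because the comparison is strict, a spread of exactly five ticks leads to a limit order, matching the "otherwise" branch of the definition.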

Cut-off strategies are not sufficiently rich to model situations
where market participants choose outcomes from
a range of values, rather than a binary set. Nevertheless,
cut-off strategies have been used widely in perfect-rationality
models, as they drastically reduce the dimensionality
of the decision space available to market participants,
which is very appealing from the standpoint of
tractability.

To our knowledge, the first model that addressed en- 
dogenous decision making between limit and market or- 
ders in a setting that resembled limit order trading was 
due to Chakravarty and Holden (1995). Trading was
modelled as a single-period game. First, a market maker
arrived and set quotes. Then, all other market participants
arrived simultaneously and chose between submitting
limit or market orders, using a cut-off strategy based
on the difference between their private valuation of the
asset and the quotes set by the market maker. Finally,
all trades were executed simultaneously using pro-rata
priority (there is no concept of time priority in a single-period
framework). Although this single-step set-up fails
to capture many crucial features of real limit order trad- 
ing and greatly simplifies the decision process facing mar- 
ket participants, it demonstrated that optimal strategies 
for informed traders could involve submitting either limit 
orders or market orders, depending on how the market 
maker acted. This in turn highlighted that endogenous 
order choice for market participants was a crucial feature 
of a successful LOB model. 



Foucault (1999) extended this idea by modelling trading
as a multi-step game in which market participants
were assumed to arrive sequentially. Limit orders were 
assumed to remain in the LOB for only one period; if the 
next arriving market participant did not submit a market 
order to match to an existing limit order, it would expire 
and be removed from the LOB. Upon arrival, each market 
participant chose between placing a limit order or a mar- 
ket order, then left the market forever. After each such 
departure of a market participant, the game ended with 
some fixed probability; otherwise, a new market partici- 
pant arrived and the process repeated. Foucault showed 
that in such a game, the optimal trading strategy for a 
market participant was a cut-off strategy based on the 
market participant's private valuation of the asset and 
the price of the existing limit order (if one existed at all).
More precisely, market participants who observed that
the LOB already contained an active buy (respectively,
sell) order at a price above (respectively, below) their
cut-off price should submit a market order to perform an
immediate trade; otherwise, they should submit a limit
order instead.

Foucault's model contains several assumptions that 
poorly mimic important aspects of real limit order trad- 
ing, including the assumptions that limit orders only last 
for a single period of time and that there is a random, 
exogenous stopping time governing trading. These assumptions
restrict the model's ability to make realistic
predictions about order flow dynamics, and, therefore,
about how market participants estimate order fill probabilities
when deciding how to act. However, Foucault's
model highlighted how the probability that a submitted 
limit order becomes matched depended explicitly on fu- 
ture market participants' actions, which are themselves 
endogenous. Furthermore, it showed that in deriving 
their own optimal strategies, market participants must 
actively consider the strategies of the other market par- 
ticipants. 



Parlour (1998) extended Foucault's model by removing
the assumption that limit orders expired after a single
period. Although Parlour's model only allowed limit orders
to be submitted at one specific price and did not allow
limit order cancellations (thus greatly simplifying the
trading process (Hollifield et al. 2006)), the work identified
explicit links between market participants' strategies
and L(t). In particular, Parlour demonstrated that the
optimal decision between submitting a limit or a market
order should be made by employing a cut-off strategy, in
which newly-arriving market participants assessed both
sides of the LOB, not just the side that they would be
directly affecting, in order to estimate the fill probability
for a limit order. If the estimated fill probability was
sufficiently high, the market participant should submit a
limit order; otherwise, they should submit a market order.
Parlour argued that limit orders became less attractive later
in the trading day due to their lower fill probabilities
before the day ended.



Hollifield et al. (2004) empirically tested whether cut-off
strategies for choosing between limit and market orders
(such as those discussed above) could explain the
observed actions of market participants trading the Ericsson
stock on the Stockholm Stock Exchange. Working
at the 1% level, the hypothesis was accepted when
the bid side or the sell side of the LOB were each considered
in isolation but rejected when both sides of the
LOB were considered together, due to the existence of
several limit orders with extremely low fill probabilities
whose expected payoff was too low for the model to justify.
Hollifield et al. concluded that cancellations, which
were absent from the models discussed above, must play
an important role in real LOBs. Hollifield et al. (2006)
later produced a model with endogenous cancellations
and concluded that optimal decisions were not made via a
simple cut-off strategy, but rather depended on the evaluation
of several interacting functions that varied between
market participants according to their personal waiting-time
penalty function.



2. Fundamental values and informed traders 

Some perfect-rationality models centre around the idea
that a subset of market participants are informed traders
who know the "fundamental" or "true" value for the asset
being traded; everyone else is uninformed and does not
know this true value (see, e.g., (Copeland and Galai 1983;
Glosten 1994; Glosten and Milgrom 1985; Kyle 1985)).
Bouchaud et al. (2009) noted that many researchers now
reject the idea of assets having fundamental values, but
such models do provide insight into price formation in
markets with asymmetric information.



In the classic Kyle (1985) model, uninformed traders
placed limit and market orders to trade with each other.
At the same time, informed traders observed the LOB 
and, if ever an uninformed trader posted a buy limit order 
with a price above (respectively, sell limit order with a 
price below) the fundamental value, an informed trader 
would submit a market order to "pick-off" the mispriced 
limit order and thereby make a profit. 

However, more recent models have highlighted several
reasons that informed traders should sometimes choose
to submit limit orders rather than market orders, for example
to avoid detection by other market participants
(who would surely mimic an informed trader whom they
knew to be well-informed about the fundamental value of
the asset, and thereby erode his/her profit opportunities
(Roşu 2010)) and to obtain better prices for their trades
(Chakravarty and Holden 1995; Roşu 2010).



An example of such a model is Goettler et al. (2006).
In a limit order market populated by agents who act 
upon asymmetric information, market participants were 
assumed to arrive following a Poisson process. Upon ar- 
rival, a market participant submitted any desired orders, 
choosing freely among prices. The agent then left the 
market and re-arrived following an independent Poisson 
process. Upon rearrival, the agent had the option to 
cancel or modify any of their active orders. When a mar- 
ket participant performed a trade, they left the market 
forever. Additionally, any market participant could, at 
any time, pay a fee to become informed about the fun- 
damental value of the asset, and to stay informed un- 
til they eventually traded. Goettler et al. investigated 
when it was optimal for a market participant to purchase 
the information (if at all), and found that a market par- 
ticipant's willingness to do so should decrease as their 
desire to trade increased. They found that speculators, 
who trade purely for profit, should buy the information 
the most often, and that the value of the information 
increased with volatility. The optimal strategy for an informed
market participant was found to contain submissions
of both limit orders and market orders. However,
as Parlour and Seppi (2008) discussed, Goettler et al.'s
step forward in realism came at the cost of discarding
analytical tractability, forcing the authors to rely solely
on numerical computations rather than closed-form expressions.



Roşu (2010) also investigated how informed traders
should optimally choose between limit orders and market
orders. In the absence of cancellation costs and monitoring
costs (so that all perfectly-rational market participants
continuously monitored and actively updated
all of their existing limit orders), it was shown that if
b(t) or a(t) exhibited an extreme mispricing, an informed
trader should submit a market order, to capitalize on
the mispricing before any other informed market participants
with the same information completed the trade
first. However, if b(t) or a(t) exhibited a less extreme
mispricing, a limit order should be submitted instead (to
gain a better price for the trade, if it was matched). The
price impact of a single informed trader's order submissions
was found to be insufficient to reset b(t) and a(t) to
the fundamental levels as described by the information,
so any subsequent informed market participants who arrived
at the market with the same information were able
to perform similar actions to also make a profit. Roşu
argued that this was a possible explanation for the empirically
observed phenomenon of event clustering (as discussed
in Section IV.F.3).



Roşu (2009) replaced the idea that market participants
who selected different prices for their orders must
have done so due to asymmetric information (Glosten
and Milgrom 1985; Kyle 1985) with the notion that different
market participants might have selected different
prices for their orders because they valued the immediacy
of trading differently. For example, in real markets
some market participants need to trade immediately and
therefore submit a market order; others do not need to
trade immediately and can therefore submit a limit order
in the hope of eventually trading at a better price.
In Roşu's model, market participants could modify and
cancel their active orders in real time, making it the first
perfect-rationality LOB model to reflect the full range
of actions available to market participants. Roşu demonstrated
that, rather than complicating the model, limit
order cancellations simplified the decision-making problem.
He proved the existence of a unique Markov-perfect
equilibrium in the game and derived the optimal strategy
for a newly arriving market participant. Furthermore, he
showed that in a LOB populated by market participants
following such a strategy, a hump-shaped depth profile
would emerge, in agreement with empirical findings from
a number of different markets (see Section IV.D).



3. Minimizing market impact



As discussed in Section IV.G, determining how to minimize
the market impact of an order is a key consideration
for market participants. Several perfect-rationality
models have suggested that the event clustering found
in empirical data (as outlined in Section IV.F.3) might
be a signature of market participants attempting to minimize
their market impact when executing large orders
(Bouchaud et al. 2009). Lillo et al. (2005) showed that
the power-law decaying autocorrelation function exhibited
by order flows present in empirical data could be reproduced
by a model in which market participants who
wished to buy or sell a large quantity of the asset did so
by submitting a collection of smaller orders sequentially
over some period of time.



In a discrete time framework, Bertsimas and Lo (1998)
derived an optimal trading strategy for a market participant
seeking to minimize expected trading costs, including
those due to market impact, when processing a very
large order that had to be completed in the next k time
steps. They showed that if prices followed an arithmetic
random walk, then the original order should be split into
k equal blocks and submitted uniformly through time.
Additionally, they showed that if prices reflected some
form of exogenous information that was serially correlated
through time, the optimal strategy involved dynamically
adjusting trade quantities at every step. However,
both of these assumptions about the behaviour of prices
poorly mimic the structure of empirically observed price
series (Lo and MacKinlay 2001). Almgren and Chriss
(2001) derived a similar strategy for executing a large
order, but instead by maximizing the utility of trading
revenues including a component to penalize for uncertainty.
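The equal-split result can be checked numerically in a simple setting. The sketch below assumes a specific form of market impact that is not spelled out above: each share bought raises the price permanently by a constant amount `impact`, on top of a zero-mean random walk whose noise drops out of the expected cost. The function names and all parameter values are illustrative assumptions.

```python
def expected_cost(schedule, initial_price, impact):
    """Expected cost of buying schedule[t] shares at time step t.

    Assumes the price at step t equals the previous price plus
    impact * schedule[t] plus zero-mean noise, so that the noise does not
    contribute to the expectation.
    """
    price = initial_price
    cost = 0.0
    for shares in schedule:
        price += impact * shares  # permanent impact of this step's trade
        cost += shares * price
    return cost


def equal_split(total_shares, k):
    """Split an order into k equal blocks."""
    return [total_shares / k] * k


total, k = 1000.0, 5
schedules = {
    "equal split": equal_split(total, k),
    "front-loaded": [600.0, 100.0, 100.0, 100.0, 100.0],
    "single block": [1000.0, 0.0, 0.0, 0.0, 0.0],
}
for name, schedule in schedules.items():
    print(name, expected_cost(schedule, initial_price=100.0, impact=0.01))
```

Under these assumptions, the expected cost equals initial_price * S + impact * (S^2 + sum of squared block sizes) / 2 for a total order of S shares, so it is minimized when all k blocks are equal.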



Obizhaeva and Wang (2005) considered the optimal
execution problem in continuous time. In this set-up,
choosing optimal times to submit orders, not just their
sizes, is crucial. The authors demonstrated that simply
considering the limit k → ∞ of a k-period discrete-time
model was not appropriate, as it led to a degenerate solution
where execution costs were strategy-independent.
By making some strong assumptions about the LOB, including
assuming that after the arrival of a market order,
the depth profile underwent exponential recovery through
time back to a neutral uniform state of n(p, t) = n(p', t)
for all prices p and p', Obizhaeva and Wang derived explicit
optimal execution strategies and concluded that the theoretical
optimum required the submission of uncountably
many orders over a finite time period. Alfonsi et al.
(2010) developed the model by removing the assumption
that the "neutral" state of the depth profile must be uniform,
although recovery to the neutral state was still assumed
to be exponential. In the absence of considerations
of permanent impact, Alfonsi et al. showed that in discrete
(respectively, continuous) time, the optimal execution
strategy involved initially submitting a large market
order to "stimulate recovery", small equally-sized market
orders at each intermediate time step (respectively, at a
fixed rate in continuous time), then another large market
order at the end. When permanent impact was also
considered, the problem was solved for the special case
of a uniform neutral depth profile.
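The qualitative shape of such a strategy can be explored numerically in a toy setting. The sketch below is not the model of Obizhaeva and Wang or of Alfonsi et al.: it assumes a uniform depth profile, a fixed multiplicative recovery factor `decay` per time step, no permanent impact, and a search restricted to schedules whose first and last trades are equal. The constant cost of trading at the undisturbed ask price is omitted because it is the same for every schedule. All function names and parameter values are hypothetical.

```python
def execution_cost(schedule, depth, decay):
    """Impact cost of buying schedule[i] shares at step i in a toy LOB.

    The book holds `depth` shares per unit of price. Each purchase pushes
    the best ask up by shares / depth, and between consecutive steps the
    displacement shrinks by the factor `decay` (exponential recovery).
    """
    displacement = 0.0
    cost = 0.0
    for i, shares in enumerate(schedule):
        if i > 0:
            displacement *= decay
        # Cost of walking the book upwards from the current displacement.
        cost += displacement * shares + shares ** 2 / (2.0 * depth)
        displacement += shares / depth
    return cost


def symmetric_schedule(total, n_steps, end_size):
    """end_size shares at the first and last step, the rest split equally."""
    middle = (total - 2.0 * end_size) / (n_steps - 2)
    return [end_size] + [middle] * (n_steps - 2) + [end_size]


total, n_steps, depth, decay = 100.0, 5, 1.0, 0.5
candidates = [i * 0.1 for i in range(501)]  # end sizes from 0 to 50 shares
best_end = min(
    candidates,
    key=lambda e: execution_cost(symmetric_schedule(total, n_steps, e),
                                 depth, decay),
)
best_schedule = symmetric_schedule(total, n_steps, best_end)
best_cost = execution_cost(best_schedule, depth, decay)
equal_cost = execution_cost([total / n_steps] * n_steps, depth, decay)
print(best_schedule, best_cost, equal_cost)
```

With these parameter values, the cheapest schedule in the restricted family places larger trades at the first and last steps than at the intermediate steps, and it costs less than an equal split.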



Such recovery of the depth profile, often known as its resiliency,
has been discussed in both the empirical (Biais et al. 1995;
Bouchaud et al. 2004; Potters and Bouchaud 2003) and modelling
literature (Foucault et al. 2005; Roşu 2009).



B. Zero-intelligence approaches 

As noted above, most perfect-rationality models to
date have relied on a series of auxiliary assumptions
in order to quantify unobservable parameters, making
it difficult to relate their predictions to real limit order
markets. By contrast, zero-intelligence models work in
a purely quantitative framework. The general approach
is to construct a stochastic model for observable processes
such as order arrivals and cancellations, estimate
its parameters from historical data, use it to produce
simulated output, and then test whether the output agrees
with empirical regularities observed in real LOB data
(such as the mean depth profile, the spread distribution,
and the stylized facts discussed in Section IV.H). In such
a framework, falsifiable hypotheses can be formulated
and the predictive power of models can be measured by
training them on a subset of available data ("in-sample")
and then evaluating their output against other
data ("out-of-sample").



1. Model framework 



Most zero-intelligence LOB models use the framework
introduced by Bak et al. (1997) to model the evolution of
$L(t)$. Orders are modelled as particles on a discrete
one-dimensional lattice, whose locations correspond to price.
Each particle corresponds to an order of size $\sigma$, so an
order of size $k\sigma$ is modelled by $k$ separate particles. Sell
orders are represented as particles of type A and buy
orders are represented as particles of type B. When two
orders of opposite type occupy the same point on the
pricing grid, the annihilation $A + B \to \emptyset$ occurs. Figure 8
illustrates how a LOB is modelled in this way.
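The particle representation can be made concrete with a short sketch. The following Python fragment is only an illustration and is not taken from Bak et al. (1997): the function names, the use of integer tick prices, and the order tuples `(side, price, size)` are all assumptions made here for the example.

```python
from collections import Counter


def lob_to_particles(orders, lot_size):
    """Map (side, price_in_ticks, size) orders to particle counts per price.

    An order of size k * lot_size becomes k particles: type A for sell
    orders and type B for buy orders.
    """
    a_particles, b_particles = Counter(), Counter()
    for side, price, size in orders:
        k, remainder = divmod(size, lot_size)
        if remainder:
            raise ValueError("order size must be a multiple of the lot size")
        if side == "sell":
            a_particles[price] += k
        elif side == "buy":
            b_particles[price] += k
        else:
            raise ValueError("side must be 'buy' or 'sell'")
    return a_particles, b_particles


def annihilate(a_particles, b_particles):
    """Apply A + B -> (nothing) wherever both particle types share a price."""
    a, b = Counter(a_particles), Counter(b_particles)
    for price in set(a) & set(b):
        matched = min(a[price], b[price])
        a[price] -= matched
        b[price] -= matched
    # Unary plus drops prices whose particle count has fallen to zero.
    return +a, +b
```

Working with integer tick prices rather than floating-point prices avoids spurious mismatches when testing whether two particles occupy the same lattice site.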



2. Random-walk diffusion models 



Bak et al. (1997) introduced the earliest class of
zero-intelligence LOB models, in which $L(t)$ was modelled as
particles diffusing along the price lattice. Given an initial
LOB state (with all A particles to the right of all
B particles), each particle was assumed to undergo a
random walk along the price lattice. If two particles
of opposite type occupied the same point on the price
lattice, an annihilation occurred. Initially, such models
were studied analytically and via Monte Carlo simulation
(Bak et al. 1997; Chan et al. 2001; Eliezer and Kogan 1998;
Tang and Tian 1999), and produced several
possible explanations for empirical regularities observed
in real LOB data, such as the hump-shaped depth profile,
as discussed in Section IV.D. However, they have
since been rejected by the modelling community because
several subsequent studies have concluded that the diffusion
of active orders across different prices is not observed
in real LOBs (Chakraborti et al. 2011a; Challet
and Stinchcombe 2001; Farmer et al. 2005). Nonetheless,
these models sparked the idea that empirical regularities
in real LOB data that were previously thought to
be a direct consequence of market participants' strategic
actions could be reproduced in a zero-intelligence framework.
This idea has subsequently become a central theme
of the zero-intelligence modelling literature (Bouchaud
et al. 2009; Farmer and Foley 2009; Farmer et al. 2005;
Smith et al. 2003).

FIG. 8 A LOB and its corresponding representation as a system
of particles on a pricing lattice. [Figure not reproduced; panels
labelled "B Particles" and "A Particles"; horizontal axis: Price
(USD), 1.24 to 1.30.]
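A minimal simulation of such a diffusion model might look as follows. This is a sketch under several assumptions that are not part of the description above: particles are updated one at a time in random order (so that two particles of opposite type cannot hop past each other without meeting), steps are unbiased and clamped at the lattice edges, and annihilated particles are not replaced. The function name `diffuse` and its parameters are hypothetical.

```python
import random


def diffuse(a_positions, b_positions, n_moves, lattice_size, seed=0):
    """Random-walk diffusion of A (sell) and B (buy) particles.

    Returns the surviving A positions, the surviving B positions, and the
    list of lattice sites (prices) at which annihilations occurred.
    """
    rng = random.Random(seed)
    a, b = list(a_positions), list(b_positions)
    trade_prices = []
    for _ in range(n_moves):
        if not a or not b:
            break  # no further annihilations are possible
        # Pick one particle uniformly at random from both populations.
        index = rng.randrange(len(a) + len(b))
        if index < len(a):
            own, other, i = a, b, index
        else:
            own, other, i = b, a, index - len(a)
        # Unbiased unit step, clamped at the lattice edges.
        target = min(max(own[i] + rng.choice((-1, 1)), 0), lattice_size - 1)
        if target in other:
            # A + B -> (nothing): remove one particle of each type.
            other.remove(target)
            own.pop(i)
            trade_prices.append(target)
        else:
            own[i] = target
    return a, b, trade_prices
```

Because every A particle starts to the right of every B particle and each move shifts a single particle by one site, an A particle can only reach a B particle by landing on its site, so the two populations never cross.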



3. Discrete-time models 



Maslov (2000) introduced a model that bore a much
stronger resemblance to real limit order trading than
the price diffusion models discussed above. In Maslov's
model, a single market participant was assumed to arrive
at the limit order market at each discrete time step.
With probability $1/2$, this market participant was a buyer;
otherwise, they were a seller. Independently of whether
they were a buyer or a seller, with probability $1 - r$ the
market participant submitted a market order; otherwise,
they submitted a limit order $x = (p_x, \sigma)$, with

\[ p_x = p' + \pi K, \]

where $p'$ was the most recent price at which a matching
had occurred, $K$ was a random variable with a specified
distribution, and

\[ \pi = \begin{cases} -1, & \text{if the trader was a buyer,} \\ 1, & \text{if the trader was a seller.} \end{cases} \]

No cancellations or modifications to active orders were
allowed. Even with only 1000 iterations and in very simple
set-ups (such as $r = 1/2$ and $K = 1$ with probability
1; or $r = 1/2$ and $K \sim \mathrm{Uniform}\{1, 2, 3, 4\}$), the series of
prices at which trades occurred was found to exhibit negative
autocorrelation on event-by-event timescales and a
heavy-tailed return distribution. Slanina (2001) showed
that the negative autocorrelation and heavy-tailed return
distribution remained when implementing a mean-field
approximation to replace the tracking of prices of individual
limit orders with that of a mean value that increased
when a limit order arrived and decreased when
a market order arrived. However, the key problem with
this line of work was that the generated mid-price return
series exhibited a Hurst exponent of $H \approx 0.25$ on
all timescales. By contrast, as discussed in Section IV.H,
real LOB data displays no long memory in mid-price returns
(i.e., $H \approx 0.5$).
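To make the mechanics of this model explicit, the following Python sketch simulates it for unit-size orders with $K$ uniform on $\{1, \ldots, \texttt{max\_offset}\}$. It is an illustration only: the starting value of $p'$ before any trade has occurred (`initial_price`), the convention that a market order arriving at an empty opposite side does nothing, and all function and parameter names are assumptions made here rather than details taken from Maslov (2000).

```python
import heapq
import random


def simulate_maslov(n_steps, r=0.5, initial_price=1000, max_offset=4, seed=0):
    """Simulate a Maslov-type model and return (trade_prices, bids, asks).

    Prices are integers (ticks). `bids` stores negated prices because
    heapq implements a min-heap.
    """
    rng = random.Random(seed)
    bids, asks = [], []
    last_price = initial_price  # p': most recent matching price (assumed start)
    trade_prices = []
    for _ in range(n_steps):
        is_buyer = rng.random() < 0.5
        if rng.random() < r:
            # Limit order at p_x = p' + pi * K, with pi = -1 for buyers.
            offset = rng.randint(1, max_offset)
            pi = -1 if is_buyer else 1
            price = last_price + pi * offset
            if is_buyer:
                heapq.heappush(bids, -price)
            else:
                heapq.heappush(asks, price)
        elif is_buyer and asks:
            # Buy market order matches the lowest-priced sell order.
            last_price = heapq.heappop(asks)
            trade_prices.append(last_price)
        elif not is_buyer and bids:
            # Sell market order matches the highest-priced buy order.
            last_price = -heapq.heappop(bids)
            trade_prices.append(last_price)
    return trade_prices, bids, asks


def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a sequence (needs non-zero variance)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return cov / var
```

Applying `lag1_autocorrelation` to the successive differences of `trade_prices` gives one simple way to examine the event-by-event autocorrelation discussed above.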

Challet and Stinchcombe (2001) refined Maslov's
model by allowing multiple particles to be deposited on
the pricing grid during a single timestep. They also
allowed existing particles to evaporate, corresponding to
the cancellation of an active order, although such evaporations
were assumed to occur exogenously and independently
for each particle, and independently of $L(t)$.
In each time step, a random vector $\gamma = (\gamma_1, \ldots, \gamma_{A_n})$
was generated, where $A_n$ was drawn from an exponential
distribution then rounded to the nearest integer and
$\gamma_1, \ldots, \gamma_{A_n}$ were independent draws from a normal
distribution, again rounded to the nearest integer. An
independent random vector $\nu = (\nu_1, \ldots, \nu_{B_n})$ was generated
in the same way. For each $\gamma_i$, $i = 1, \ldots, A_n$, an
A particle was deposited on the pricing grid with
ask-relative price $\gamma_i$; for each $\nu_j$, $j = 1, \ldots, B_n$, a B
particle was deposited on the pricing grid with bid-relative
price $\nu_j$. All $A + B \to \emptyset$ annihilations occurred at prices
with both A and B particles. Finally, all remaining particles
evaporated independently with fixed probability,
then the whole process was repeated. Challet and
Stinchcombe's model exhibited a heavy-tailed return distribution
and volatility clustering, and the Hurst exponent of
the mid-price return series at large timescales was 0.5.
The authors conjectured that it was the evaporations
in their model, which had been absent in Maslov (2000),
that made the Hurst exponent at large timescales match
that of empirical data. However, at all shorter timescales
Challet and Stinchcombe's model exhibited a Hurst exponent
$H < 1/2$, which is inconsistent with empirical data
(as discussed in Section IV.H).



4. Continuous-time models 

The generalization of zero-intelligence approaches to
continuous time was due to a model first introduced by
Daniels et al. (2002) and then developed by Smith et al.
(2003). Modelling market order arrivals and limit order
arrivals and cancellations as independent Poisson processes,
and modelling the relative price of an incoming
limit order as being uniformly distributed on the
semi-infinite interval $(-s(t), \infty)$, the evolution of the LOB was
described by a master equation. The master equation
was solved under a mean-field approximation that the
depth available at neighbouring prices was independent,
in the limit $\delta p \to 0$ (i.e., assuming that the pricing grid
was continuous, not discrete). Guided by dimensional
analysis, the authors constructed simple, closed-form
estimators for a variety of LOB properties, such as the
mean spread, mean depth available at a given price, and
mid-price diffusion, in terms of only the Poisson processes'
arrival rates and the lot size $\sigma$. Similar results
were achieved using Monte Carlo simulations. The model
also provided possible explanations for why some empirical
properties of LOBs varied across different markets (as
discussed in Section IV). In particular, the lot size $\sigma$
appeared explicitly in many of the closed-form estimators
derived, and "phase transitions" between fundamentally
different types of market behaviour were observed to occur
as $\sigma$ varied.
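A Monte Carlo simulation of a model of this type can be sketched with a standard event-driven (Gillespie-style) scheme. The fragment below is a simplified illustration, not the implementation used by Daniels et al. (2002) or Smith et al. (2003): it assumes a finite lattice of `n_prices` ticks instead of a semi-infinite interval, unit-size orders, a limit order arrival rate `alpha` per price per unit time, a total market order rate `mu` split equally between buys and sells, and a cancellation rate `delta` per active order. All names are hypothetical.

```python
import random


def simulate_poisson_lob(n_events, n_prices=100, alpha=0.5, mu=2.0,
                         delta=0.1, seed=0):
    """Event-driven simulation of a Poisson order-flow model on a lattice.

    Returns (elapsed_time, bids, asks, spreads), where bids[p] and asks[p]
    count the unit-size orders at price p and spreads holds the spread (in
    ticks) seen just before each event at which both sides were non-empty.
    """
    rng = random.Random(seed)
    prices = range(n_prices)
    bids = [0] * n_prices
    asks = [0] * n_prices
    clock = 0.0
    spreads = []
    for _ in range(n_events):
        best_bid = max((p for p in prices if bids[p]), default=-1)
        best_ask = min((p for p in prices if asks[p]), default=n_prices)
        if best_bid >= 0 and best_ask < n_prices:
            spreads.append(best_ask - best_bid)
        n_bids, n_asks = sum(bids), sum(asks)
        rates = [
            alpha * best_ask,                   # limit buy below the best ask
            alpha * (n_prices - 1 - best_bid),  # limit sell above the best bid
            mu / 2 if n_asks else 0.0,          # market buy
            mu / 2 if n_bids else 0.0,          # market sell
            delta * n_bids,                     # cancellation of a buy order
            delta * n_asks,                     # cancellation of a sell order
        ]
        # Waiting time to the next event is exponential with the total rate.
        clock += rng.expovariate(sum(rates))
        event = rng.choices(range(6), weights=rates)[0]
        if event == 0:
            bids[rng.randrange(best_ask)] += 1
        elif event == 1:
            asks[rng.randrange(best_bid + 1, n_prices)] += 1
        elif event == 2:
            asks[best_ask] -= 1
        elif event == 3:
            bids[best_bid] -= 1
        elif event == 4:
            # Every active buy order is equally likely to be cancelled.
            bids[rng.choices(prices, weights=bids)[0]] -= 1
        else:
            asks[rng.choices(prices, weights=asks)[0]] -= 1
    return clock, bids, asks, spreads
```

Note that `spreads` is sampled once per event rather than weighted by the time spent in each state, so its average is an event-time average and would need time weighting before being compared with a calendar-time mean spread.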



Many of the assumptions made by the Daniels et al.
(2002) and Smith et al. (2003) model in order to maintain
analytical tractability result in a poor resemblance to
some aspects of real limit order markets. For example, in
the limit $\delta p \to 0$, the only possible number of limit orders
that can reside at a given price $p$ is 0 or 1. This is because
the relative price of an incoming limit order is chosen
from the continuous uniform distribution, and therefore
the probability of an incoming limit order having exactly
the same price as an active order is infinitesimal. This
destroys the notion of limit orders "queueing up", and
thus removes what is a primary consideration for market
participants: when to submit an order at the back of an
existing priority queue versus when to start a new queue
at a worse price (Parlour and Seppi 2008). Additionally,
by assuming that all the Poisson processes are independent
of each other, the model discards the conditional
structure of arrival rates known to exist in empirical data.
Despite these simplifications, the model performed
well when tested against some aspects of empirical data
(Farmer et al. 2005). Predictions of the mean spread
and a measure of price diffusion were made for 11
stocks from the LSE by calibrating the model's parameters
using historical data. The empirical mean spread
and price diffusion were then calculated directly from the
data and compared with the model's predictions using
an ordinary least-squares regression. In particular, a
straight line was fitted to the relationship

\[ Z_{\mathrm{emp}}(i) = z \, Z_{\mathrm{mod}}(i) + c, \]

where $Z_{\mathrm{emp}}(i)$ and $Z_{\mathrm{mod}}(i)$ were the mean empirical
and model output values of statistic $Z$ for stock $i$, for
$i = 1, \ldots, 11$. Under this set-up, $z = 1$, $c = 0$ would
correspond to a perfect fit of the model to the data. For the
mean spread, the ordinary least-squares estimates of the
parameters were $z = 0.99 \pm 0.10$ and $c = 0.06 \pm 0.26$,
and for the price diffusion, the ordinary least-squares
estimates of the parameters were $z = 1.33 \pm 0.10$ and
$c = 0.06 \pm 0.26$. Bootstrap resampling was used to
estimate the standard errors, as serial correlations within
the data invalidated the assumptions required to make
such estimation under ordinary least-squares regression.
The model was also found to predict different price
diffusions over different timescales, in agreement with
empirical data. However, the distribution of price returns
predicted by the model was thin-tailed on all timescales,
thus poorly fitting empirical data.

Farmer et al. (2005) studied price diffusion by calculating the
variance $V_\tau$ of the set $\{m(t_i + \tau) - m(t_i) \mid i = 1, \ldots, n\}$ for
various values of $\tau$, where $\{t_i \mid i = 1, \ldots, n\}$ was the set of times
at which the mid-price changed. They then performed a
least-squares regression to estimate $D$ in the expression $V_\tau = D\tau$.
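The comparison procedure described above can be sketched in a few lines of Python. The code is illustrative only. In particular, the resampling scheme (a pairs bootstrap over the stocks) and the choice to fit $V_\tau = D\tau$ as a regression through the origin are assumptions made here for concreteness; the description above does not specify how Farmer et al. (2005) implemented either step, and all function names are hypothetical.

```python
import random
import statistics


def ols_fit(x, y):
    """Ordinary least-squares fit of y = z * x + c; returns (z, c)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    z = sxy / sxx
    return z, mean_y - z * mean_x


def bootstrap_standard_errors(x, y, n_resamples=1000, seed=0):
    """Pairs-bootstrap standard errors of the OLS slope and intercept."""
    if len(set(x)) < 2:
        raise ValueError("need at least two distinct x values")
    rng = random.Random(seed)
    n = len(x)
    slopes, intercepts = [], []
    while len(slopes) < n_resamples:
        sample = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in sample]
        if len(set(xs)) < 2:
            continue  # slope is undefined for this resample; draw again
        z, c = ols_fit(xs, [y[i] for i in sample])
        slopes.append(z)
        intercepts.append(c)
    return statistics.stdev(slopes), statistics.stdev(intercepts)


def fit_through_origin(taus, variances):
    """Least-squares estimate of D in V_tau = D * tau (no intercept)."""
    return (sum(t * v for t, v in zip(taus, variances))
            / sum(t * t for t in taus))
```

Here `x` would hold the model output values $Z_{\mathrm{mod}}(i)$ and `y` the empirical values $Z_{\mathrm{emp}}(i)$, one pair per stock, so that a fitted slope near 1 and intercept near 0 indicate close agreement.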



Cont et al. (2010) recently introduced a variant of the
Daniels et al. (2002) and Smith et al. (2003) model. Their
main purpose was to study conditional (rather than
equilibrium) behaviour: i.e., to understand how the frequency
of occurrence of certain events was linked to $L(t)$. The
model did not make the assumption that $\delta p \to 0$, thus
ensuring that priority queues formed at discrete points
on the price lattice. The assumption of a uniform
distribution of relative prices (as made by Daniels et al. (2002)
and Smith et al. (2003)) was also removed, and replaced
by a power-law distribution, as suggested by empirical
data (Bouchaud et al. 2002; Cont et al. 2010; Potters
and Bouchaud 2003; Zovko and Farmer 2002).
Simulations of the Cont et al. (2010) model displayed
the hump-shaped depth profile commonly reported in
empirical data (see Section IV). Using Laplace transforms,
the authors computed conditional probability
distributions for several events, including an increase in $m(t)$
at its next move, the matching of a limit order placed
at $b(t)$ before $a(t)$ moved, and the matching of both a
limit order at $b(t)$ and a limit order at $a(t)$ ("earning
the spread") before $m(t)$ moved. Comparing the model's
predictions to empirical market data for a single stock
traded on the Tokyo Stock Exchange revealed fair, but
not strong, agreement.



The Cont et al. (2010) model has recently been extended
(Toke 2011; Zhao 2010) by revising the assumed
arrival structure of market events. After conducting an
empirical study of crude oil futures traded at the
International Petroleum Exchange, Zhao (2010) rejected the
assumption that the inter-arrival times of market events
were independent draws from an exponential distribution,
and thereby rejected the use of independent Poisson
processes to model market event arrivals. Zhao replaced
the independent Poisson processes in Cont et al.'s model
with a Hawkes process (Bauwens and Hautsch 2009)
that described the arrival rate of all market events as a
function of recent order arrival rates and of the number
of arrivals that had recently occurred. When an arrival
occurred, its type (e.g., market order arrival, limit order
cancellation) was determined exogenously. This produced
order flows in which periods of high arrival rates
clustered together in time, and periods of low arrival rates
clustered together in time, in agreement with empirical
data (Ellul et al. 2003; Hall and Hautsch 2006). Zhao
demonstrated that this modification to the model of Cont
et al. resulted in an improved fit of the empirically observed
mean relative depth profile. Toke (2011) similarly
replaced the Poisson processes in Cont et al.'s model with
Hawkes processes, but unlike Zhao, Toke used multiple
mutually-exciting Hawkes processes, one for each type of
market event. By studying empirical data from several
different asset classes, Toke observed that once a market
order had been placed, the mean time until the next limit
order was placed was less than the corresponding
unconditional mean time. The use of Hawkes processes allowed
a coupling between the arrival rates of limit orders and
market orders, which produced simulated order flow that
matched his empirical observations. The distribution of
spreads generated by the Hawkes processes was closer to
the true distribution of spreads than that generated by a
Poisson process model.
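To illustrate how a self-exciting arrival process produces clustered order flow, the following Python sketch simulates a univariate Hawkes process with a constant baseline intensity and a single exponential kernel, so that the intensity is $\lambda(t) = \lambda_0 + \sum_{t_i < t} C e^{-D (t - t_i)}$. It uses the standard thinning approach (often attributed to Ogata) and is not the specification used by Zhao (2010) or Toke (2011). The restriction to one kernel, the constant baseline, and all function and parameter names are assumptions made for this example.

```python
import math
import random


def hawkes_intensity(t, arrivals, baseline, c, d):
    """Intensity at time t given the arrival times that occurred before t."""
    return baseline + sum(c * math.exp(-d * (t - ti))
                          for ti in arrivals if ti < t)


def simulate_hawkes(horizon, baseline, c, d, seed=0):
    """Simulate arrival times on (0, horizon) by thinning.

    Requires baseline > 0 and 0 <= c < d; the ratio c / d is the expected
    number of further arrivals triggered directly by each arrival.
    """
    if baseline <= 0 or c < 0 or c >= d:
        raise ValueError("need baseline > 0 and 0 <= c < d")
    rng = random.Random(seed)
    arrivals = []
    t = 0.0
    while True:
        # The kernel only decays between arrivals, so the intensity just
        # after time t bounds the intensity until the next arrival.
        upper = baseline + sum(c * math.exp(-d * (t - ti)) for ti in arrivals)
        t += rng.expovariate(upper)
        if t >= horizon:
            return arrivals
        # Accept the candidate with probability intensity(t) / upper.
        if rng.random() * upper <= hawkes_intensity(t, arrivals, baseline, c, d):
            arrivals.append(t)
```

Setting `c = 0` recovers a homogeneous Poisson process with rate `baseline`, which gives a convenient reference case when comparing the clustering of inter-arrival times.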

A Hawkes process is a point process with time-varying intensity
parameter $\lambda(t) = \lambda_0(t) + \sum_{t_i < t} \sum_j C_j e^{-D_j (t - t_i)}$, where the $t_i$
denote the times of previous arrivals and the $C_j$ and $D_j$ are
parameters controlling the intensity of arrivals.

Cont and de Larrard (2011) recently introduced a
model in which only $n(b(t), t)$ and $n(a(t), t)$ were tracked,
rather than the whole depth profile. When the depth
available at either of the prices became zero, they assumed
that the depth available at the next best price
was a random variable drawn from a distribution $f$. The
state space of this model is $\mathbb{N}^2$, which is of much lower
dimension than the state spaces of most other recent LOB
models. The justification for such a
simplified set-up was that, in real time, many market
participants only have access to the depths available at the
best prices, not the whole depth profile (although this is
increasingly less common as electronic trading platforms
deliver ever more up-to-date information about the LOB
in real time to market participants (Boehmer et al. 2005;
Bortoli et al. 2006)). Market order arrivals, limit order
arrivals, and limit order cancellations were assumed to be
governed by independent Poisson processes. Analytical
estimates for several market properties (such as volatility,
the distribution of time until the next change in $m(t)$,
the distribution and autocorrelation of price changes, and
the conditional probability that $m(t)$ moves in a specified
direction given the depths available at $b(t)$ and $a(t)$)
were deduced solely in terms of the Poisson processes'
rate parameters and the distribution $f$. Additionally,
different levels of autocorrelation of the mid-price series
were shown to emerge from the model at different sampling
frequencies, in agreement with empirical observations
(Cont 2001; Zhou 1996).
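A reduced-form model that tracks only the depths at the best prices is straightforward to simulate. The sketch below is a loose illustration of the idea rather than the model of Cont and de Larrard (2011): it assumes that each queue grows at a constant rate `lam` (limit order arrivals) and shrinks at a constant rate `mu_theta` (market orders plus cancellations), that the mid-price moves by one tick when a queue empties, and that both queue sizes are then redrawn from the same distribution. The function `draw_depth`, its default, and all other names are hypothetical.

```python
import random


def simulate_best_queues(n_events, lam=1.0, mu_theta=1.0, draw_depth=None,
                         seed=0):
    """Simulate the two best queues; return [(time, mid_price_in_ticks), ...].

    draw_depth(rng) must return a positive integer and plays the role of
    the distribution from which depths are redrawn after a price move.
    """
    rng = random.Random(seed)
    if draw_depth is None:
        # Assumed default: depths uniform on {1, ..., 10}.
        def draw_depth(r):
            return r.randint(1, 10)
    q_bid, q_ask = draw_depth(rng), draw_depth(rng)
    mid_ticks = 0
    clock = 0.0
    path = [(clock, mid_ticks)]
    total_rate = 2 * (lam + mu_theta)
    for _ in range(n_events):
        clock += rng.expovariate(total_rate)
        u = rng.random() * total_rate
        if u < lam:
            q_bid += 1          # limit order joins the bid queue
        elif u < 2 * lam:
            q_ask += 1          # limit order joins the ask queue
        elif u < 2 * lam + mu_theta:
            q_bid -= 1          # market sell or cancellation at the bid
        else:
            q_ask -= 1          # market buy or cancellation at the ask
        if q_bid == 0 or q_ask == 0:
            # An empty ask queue moves the price up; an empty bid queue
            # moves it down. Both depths are then redrawn.
            mid_ticks += 1 if q_ask == 0 else -1
            q_bid, q_ask = draw_depth(rng), draw_depth(rng)
            path.append((clock, mid_ticks))
    return path
```

Sampling the returned path at different time intervals provides a simple way to examine how the autocorrelation of price changes depends on the sampling frequency.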



5. Beyond zero intelligence 

The models discussed so far in this section all attempt 
to maintain analytical tractability in a zero-intelligence 



framework. Mike and Farmer (2008) took a different approach,
and instead designed a zero-intelligence model
that mimicked very closely the order flows they observed 
in an empirical study of data from the LSE. The model 
assumed that the relative price of each incoming order 
was drawn independently from a Student's t distribution, 
and closely matched cancellation rates for active orders 
to empirical data. 

For stocks with a small tick size and low volatility, 
the model was found to correctly exhibit negative au- 
tocorrelation of logarithmic mid-price returns on short 
timescales. Furthermore, the model was found to make 
good predictions of the distribution of mid-price returns, 
including heavy tails, and of s(t). This is a substantial 
improvement on previous models, whose predictions con- 
cerned only the mean values of such statistics rather than 
their whole distribution. However, for other stocks, the 
model was found to perform less well. 



Although Mike and Farmer (2008) did not assume that
any market participants are rational, nor that they are
attempting to maximize some personal utility by trading,
the highly conditional structure of random variables in
their model suggests ways in which observed regularities
in order flow might be motivated by rational
decision-making. For example, the existence of higher
cancellation rates near $b(t)$ and $a(t)$ can be interpreted as
the impatience felt by market participants who choose
to submit limit orders at such aggressive prices in the
first place. The lower rate of cancellation deeper inside
the LOB reflects the fact that market participants would
not submit such orders unless they were willing to wait
patiently for them to be filled at some point in the future.



Gu and Zhou (2009a) simulated the Mike and Farmer
(2008) model and performed a DFA (see Section III.M)
on the output mid-price return and volatility series. They
found that neither series exhibited a long memory. For
the mid-price return series, this is an accurate replication
of the stylized facts (see Section IV.H), whereas for
the volatility series, this indicates that the model fails to
capture the empirically observed long memory. Gu and
Zhou then proposed an extension to the model, in which
the relative prices were drawn from a Student's t
distribution with long memory, rather than independently.
They found that such a modification caused the volatility
series to exhibit long memory, in line with the stylized
facts, while still retaining all of the model's other results.



Gu and Zhou (2009b) replaced the Student's t distribution
for relative prices of incoming orders with several
other distributions, and simulated the Mike and Farmer
(2008) model to examine how this affected its output.
They found that the empirically observed power-law tail
in the mid-price return distribution only appeared in the
model's output when the distribution from which positive
relative prices were drawn had heavy tails, irrespective
of whether the distribution from which negative relative
prices were drawn had heavy tails or not.



C. Agent-based models

An agent-based model is a model in which a large number
of (possibly heterogeneous) agents, each with specific
rules governing their behaviour, are assumed to interact
in a specified way. Both the performance of individual
agents and the aggregate effect of all agents in
the system can be studied, either analytically or via
Monte Carlo simulation. By allowing each individual
agent's behaviour to be specified without any explicit
requirements regarding rationality in a given circumstance,
agent-based models lie between the two extremes of
zero-intelligence and perfect-rationality models.

Chakraborti et al. (2011a) highlighted that a key advantage
of agent-based models in comparison to
zero-intelligence and perfect-rationality models that aggregate
across market participants is that heterogeneity between
different market participants in real markets can be
incorporated directly. Models with independent and
identically distributed order flows can then be considered as
special cases of agent-based models in which there is a
single representative agent in the market, of which
all other market participants are perfect clones.

However, agent-based models of limit order markets
also have drawbacks. Due to the large number of
interacting components in a LOB, agent-based modelling
makes it difficult to track explicitly how a specified input
parameter affects the output. It is also very difficult to
encode quantitatively the complex and interacting strategies
of market participants in a real limit order market
into a set of rules governing an agent in an agent-based
model, and finding a set of agent rules that produces
a specific behaviour from the model provides no guarantee
that such a set of rules is the only one to do so
(Preis et al. 2007). Abergel and Jedidi (2011) attempted
to address these issues by explicitly deriving systems of
stochastic differential equations that described the order
flows arising from specified agent-based models. Such
equations could then be studied analytically, and equations
for the price dynamics were derived in terms of the
agent-based model's input parameters, thereby demonstrating
the exact links between the two. For example, a
very simple model was shown to result in Gaussian process
dynamics, with a diffusion coefficient that depended
on the model's input parameters. Volatility dynamics
were similarly derived.

Early agent-based models of LOBs assumed that
agents arrived sequentially at the market (Foucault et al.
2005), and that the LOB emptied at the end of every
time step. Such a set-up failed to acknowledge the LOB's
key functionality of storing supply and demand for later
consumption by other market participants (Smith et al.
2003). However, more recent agent-based models have
more closely mimicked the process of real limit order
trading, and have been able to reproduce a wide range
of empirical features present in LOB data (Challet and
Stinchcombe 2003; Chiarella and Iori 2002; Cont and
Bouchaud 2000; Preis et al. 2006).



Cont and Bouchaud (2000) showed that when agents
were assumed to imitate each other, a heavy-tailed return
distribution emerged. The model's output also exhibited
clustered volatility and aggregational Gaussianity, as
discussed in Section IV.H.



Chiarella and Iori (2002) noted that if all agents were
assumed to share a common valuation regime for the asset
being traded, the realized volatility was too low compared
to empirical data and there was no volatility clustering.
They thereby argued that substantial heterogeneity
must exist between market participants in order for
the highly non-trivial properties of volatility, as discussed
in Section III.H, to emerge in real limit order markets.
Cont (2005) noted that differences in agents' timescales
(i.e., their level of impatience) could be a source of such
heterogeneity in real markets.



Preis et al. (2006) reproduced the main findings of 



the Smith et al. (2003) model, but using an agent-based
model rather than independent Poisson processes. By
finely tuning agents' trading strategies, the heavy-tailed
distribution of mid-price returns, the super-diffusivity
of mid-price returns over medium timescales, and the
negative autocorrelation of $m(t)$ on an event-by-event
timescale were all reproduced by the model. The per- 
formance of individual agents in the model has also been 
studied (Preis et al. 2007). The Hurst exponent H of 



the mid-price series was found to vary according to the 
number of agents in the model. The best fit of H against 
values calculated from empirical data was found to occur 
with 150 to 500 "liquidity provider" (i.e., limit order plac- 
ing) agents and 150 to 500 "liquidity taker" (i.e., market 
order placing) agents in the model. 



Challet and Stinchcombe (2003) highlighted that many
early LOB models assumed that model parameters (such
as the arrival rate of new limit orders) were constant
through time. They therefore produced an agent-based
model with parameters that varied over time, and compared
its output to a version of the same model where
such parameters were instead held fixed. They found that
allowing the parameters to vary resulted in the emergence
of a heavy-tailed distribution of mid-price movements,
autocorrelated mid-price returns, and volatility clustering.



Lillo (2007) showed how an agent-based model could
explain the empirically observed power-law distribution
of relative prices of incoming orders (see Section IV.B).
In particular, he solved a utility maximization problem
to show that if mid-price movements were assumed to
follow a Brownian motion, each perfectly rational agent
should choose the relative price of their submitted orders
to be

\[ \Delta_x = \sqrt{2}\, g^{-1}(\alpha)\, V\, T^{1/2}, \tag{9} \]

where $g(\alpha)$ describes the individual agent's risk aversion,
$T$ is the individual agent's maximum time horizon (i.e.,
the maximum length of time that the agent is willing
to wait before performing the trade), and $V$ is the market
volatility. He then studied how empirically observed
heterogeneity in $g$ and $T$ and empirically observed
fluctuations in $V$ affected the relative price choices of many
interacting agents with different risk aversions $g$ and
different maximum time horizons $T$. He found that
heterogeneity in $T$ was the most likely source of the power-law
tails in the distribution of $\Delta_x$, and that the heterogeneity
in $g$ and fluctuations in $V$ that were empirically observed
in a wide range of markets were unlikely to lead to a
power-law tail in the distribution of $\Delta_x$.
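The role of heterogeneity in the time horizon can be illustrated numerically. The sketch below evaluates the relative price $\sqrt{2}\, g^{-1}(\alpha)\, V\, T^{1/2}$ for a population of agents whose horizons $T$ are drawn from a Pareto distribution. The Pareto form, the fixed common values of $g^{-1}(\alpha)$ and $V$, and all function and parameter names are assumptions made purely for illustration and are not taken from Lillo (2007). Under these assumptions, if $P(T > t) = t^{-\beta}$ for $t \geq 1$, then the relative price inherits a power-law tail with exponent $2\beta$, because it is proportional to $T^{1/2}$.

```python
import math
import random


def optimal_relative_price(g_inverse_alpha, volatility, horizon):
    """Evaluate sqrt(2) * g^{-1}(alpha) * V * T^{1/2}.

    g_inverse_alpha is passed in as a number; no particular form of the
    risk-aversion function g is assumed.
    """
    return math.sqrt(2.0) * g_inverse_alpha * volatility * math.sqrt(horizon)


def sample_relative_prices(n_agents, tail_exponent, g_inverse_alpha=1.0,
                           volatility=1.0, seed=0):
    """Relative prices for agents with Pareto-distributed horizons T >= 1."""
    rng = random.Random(seed)
    horizons = [rng.paretovariate(tail_exponent) for _ in range(n_agents)]
    return [optimal_relative_price(g_inverse_alpha, volatility, horizon)
            for horizon in horizons]
```

Plotting the empirical complementary distribution of the sampled relative prices on doubly logarithmic axes gives a quick visual check of the tail exponent implied by a chosen `tail_exponent`.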



VI. DISCUSSION 



To date, no one has formulated a market mechanism
design problem to which the LOB is known to be the
optimal solution (Parlour and Seppi 2008). However, by
giving every market participant the freedom to evaluate
their own need for immediate liquidity, LOBs have
revolutionized the process of trading. The rewards (and,
indeed, the risks) associated with patience are now available
to all, rather than being reserved for a small number
of market makers. New trading strategies have emerged
from limit order trading (Bertsimas and Lo 1998; Biais
et al. 1995; Obizhaeva and Wang 2005), and market
participants must now continually consider the trade-off
between submitting orders that match at a better price
and submitting orders that match in a shorter time.

Empirical studies and theoretical models have deepened
understanding of specific aspects of limit order trading.
However, as we have discussed at length, a key unresolved
question is how the various pieces of the puzzle fit
together. For example, models that accurately capture
the dynamics of the price process on an event-by-event
timescale poorly reproduce price dynamics on inter-day
timescales. Similarly, models that explain price dynamics
on inter-day timescales offer little understanding of how
such dynamics are motivated by the LOB microstructure.

As discussed in Section IV.H, several stylized facts have
been empirically observed in a wide range of markets.
Many authors (e.g., Gu and Zhou 2009a; Lillo 2007;
Stanley et al. 2008) agree that one of the main challenges
facing researchers of LOBs today is to understand
these stylized facts better. LOB models can help us to
achieve this, and some progress has been made. However,
no single model has yet been capable of simultaneously
reproducing all of the stylized facts, nor is there a clear
picture about precisely how the stylized facts emerge as
a consequence of the actions of many heterogeneous market
participants. This continues to be an active area of
research.

FIG. 9 Number of total days each year examined by empirical
studies listed in Appendix A. [Plot not reproduced; title:
"Approximate total number of days' data examined by empirical
studies in Table 2"; horizontal axis: Year, 1990 to 2010.]



A great deal of effort has been invested in the study 
of empirical LOB data. Figure 9 displays a plot of the
approximate number of days' data per year that studies
discussed in this article have examined, as described
in Appendix A. Although the breadth of such empirical
work is substantial, the overwhelming picture painted by
Figure 9 is that the data studied is old. Often, it is also
of poor quality. Strong assertions have been made in 
empirical studies based on single stocks over very short 
time periods. The data studied rarely includes all or- 
der flows at all prices, so extensive auxiliary assumptions 
are often required before any statistical analysis can even 
begin. Additionally, markets change over time, so empir- 
ical observations from more than a decade ago might not 
accurately describe current LOB activity. 

There are also substantial challenges associated with 
studying historical limit order book data. Several LOB 
properties are believed to exhibit a long memory, al- 
though directly testing this hypothesis on a finite data 
set is a difficult task. Estimating the Hurst exponent 
H is similarly problematic, and several empirical stud- 
ies contain systematic errors in their calculations. This 
makes quantitative comparisons between potential long- 
memory processes very difficult, and it is unclear whether 
the differences in the reported estimates of Hurst exponents
are really a result of differences between markets
or simply a result of differences in methodology when
performing such estimation. Furthermore, such long-range
correlations make it difficult to estimate confidence in- 
tervals for several LOB statistics, as the effective sample 
size of observations is far smaller than the number of 
data points. Studies of recent, high-quality LOB data 
that are conducted with stringent awareness of these po- 
tential statistical pitfalls are needed to understand better 
the LOBs of today. 

Direct comparisons between different empirical stud- 
ies are also problematic. The behaviour and strategies of 
market participants may have changed over time, mean- 
ing that the date over which data was collected might 
itself play a role in the statistics observed within it. This 
is particularly important with the recent surge in pop- 
ularity of electronic trading algorithms, which are able 
to process data to interpret changing market conditions, 
then to submit or cancel orders accordingly, in a frac- 
tion of the time that it would take a human to perform 
the same task. However, the date range of a study is far 
from being the only factor determining the statistical be- 
haviour of empirical data. The sampling frequency, asset 
class studied, LOB resolution parameters, specific trade- 
matching nuances, and many other factors all influence 
empirical data, but all vary from study to study. This 
makes it difficult to address questions about how alter- 
ing any one of these factors in isolation might change the 
observed statistics. 

Another aspect that is clear from empirical studies is 
how poorly the data supports the very strong assump- 
tions made by many LOB models. Although every model 
must make assumptions to facilitate computation, many 
LOB models have been built on elaborate and inaccu- 
rate assumptions that make it almost impossible to re- 
late their output to real LOBs. Indeed, this is a common 
problem when using purely statistical models to probe 
LOB data, as it is often unclear how aspects of the mod- 
els relate to specific elements of a LOB. 

Although precisely what is meant by "equilibrium" de-
pends upon context, almost all LOB models to date
have focused on some form of equilibrium, such as a 
Markov-perfect equilibrium in sequential-game models 
or a state-space equilibrium in reaction-diffusion mod- 
els. However, overwhelming empirical evidence suggests 
that LOBs are subjected to constant shocks and there- 
fore always display transient behaviour. Early work on 
models that are not in equilibrium has hinted at promis-
ing results (Challet and Stinchcombe 2003), but more
work in this area is needed.

Both perfect-rationality and zero-intelligence ap- 
proaches to LOB modelling have proven useful in en- 
abling analytic computation and numerical simulation, 
yet both require strong assumptions that are not justi- 
fied by data. Agent-based models appear to offer some 
compromise between the two extremes. They can op-
erate within a zero-intelligence framework (if no ratio-
nal decision-making is programmed into agents' spec-
ifications); a perfectly-rational framework (if rational
decision-making based on the maximization of a util- 
ity function is the only specification driving agents' be- 
haviour); or, crucially, they can reside somewhere be- 
tween. Furthermore, the level of game-theoretic con- 
siderations involved in agents' decision-making can also 
be controlled by specifying how strongly agents react to 
each other and forecast each other's actions. Therefore, 
agent-based models have the potential to provide a rich 
toolbox for investigating LOBs without a requirement for 
extreme modelling assumptions. However, it remains un- 
clear whether agent-based models of limit order trading 
really offer a deeper understanding of market dynamics or 
merely amount to curve-fitting exercises in which param- 
eters are varied until some form of non-trivial behaviour 
emerges. Nevertheless, recent developments suggest that 
the performance of LOB models can be improved by re-
moving the inherent homogeneity associated with many
zero-intelligence approaches (Toke 2011; Zhao 2010).
Consequently, the heterogeneity offered by agent-based
models might pave the way for new explanations of LOB 
phenomena. 
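
The spectrum between zero intelligence and perfect rationality can be sketched with a single hypothetical parameter. In the minimal example below, an agent picks the utility-maximizing action with probability `rationality` and otherwise acts uniformly at random. The action labels, the utility values, and the parameter itself are illustrative assumptions, not a specification drawn from any published agent-based model.

```python
import random
from typing import Callable, Sequence


def choose_action(actions: Sequence[str],
                  utility: Callable[[str], float],
                  rationality: float,
                  rng: random.Random) -> str:
    """Pick an action for one agent.

    rationality = 0.0 gives a zero-intelligence agent (uniformly random),
    rationality = 1.0 gives a pure utility maximizer, and intermediate
    values mix the two behaviours.
    """
    if not actions:
        raise ValueError("actions must be non-empty")
    if not 0.0 <= rationality <= 1.0:
        raise ValueError("rationality must lie in [0, 1]")
    if rng.random() < rationality:
        return max(actions, key=utility)  # rational branch
    return rng.choice(list(actions))      # zero-intelligence branch


if __name__ == "__main__":
    # Hypothetical utilities for illustration only.
    utilities = {"limit_buy": 0.2, "limit_sell": 0.1,
                 "market_buy": 0.5, "market_sell": 0.0, "cancel": 0.3}
    rng = random.Random(42)
    for level in (0.0, 0.5, 1.0):
        picks = [choose_action(list(utilities), utilities.__getitem__, level, rng)
                 for _ in range(1000)]
        print(level, picks.count("market_buy") / len(picks))
```

Heterogeneity can then be introduced by giving different agents different values of `rationality` or different utility functions, although whether such a parametrization yields genuine insight or mere curve fitting is exactly the open question raised above.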

Price changes and volatility are among the most hotly
debated topics in limit order markets (Almgren and
Chriss 2001; Bouchaud et al. 2009; Hasbrouck 1991;
Potters and Bouchaud 2003). What causes volatility to
vary over time? Why should periods of high activity clus-
ter together in time? Why should price fluctuations be
so frequent (and so large) on intra-day timescales, given
that external news events occur so rarely (Maslov 2000)?
It is not even agreed whether the number of market or-
ders (Jones et al. 1994), the size of market orders (Gal-
lant et al. 1992), or liquidity fluctuations in the LOB
(Bouchaud et al. 2009) play the dominant role in deter-
mining volatility. It seems likely that the answers to such
questions will not be found in isolation, but rather that 
there is an intricate interplay between the many parts of 
the "volatility puzzle". Recent work has attempted to 
tie some of these ideas together. Specifically, Bouchaud
et al. (2009) conjectured that volatility might be better
understood by considering the need for market partici-
pants to minimize their market impact. In particular,
although news events happen rarely, when they do oc- 
cur they cause market participants to want to buy or sell 
very large quantities of the asset being traded. Market 
participants understand that they cannot simply perform 
a large trade immediately, as the market impact of their 
actions would cause them to trade at very unfavourable 
prices. Instead, they break up large trades into smaller 
chunks that are then gradually processed over weeks or 
even months. Due to their differing needs for immediacy, 
or indeed their differing reaction to the released news, 
different market participants choose different times for 
the submission of such chunks, causing a cascading of 
the original news event to the submission of multiple dif-
ferent orders at multiple different times. It will be in-
teresting to observe whether this explanation withstands
closer examination. 
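
The cascading mechanism described above can be sketched in a few lines. The example below assumes, purely for illustration, that each participant splits a large trade into equal chunks and submits one chunk per day from a participant-specific start day. The function names, the even split, and the daily schedule are hypothetical simplifications; real execution strategies (such as those of Almgren and Chriss 2001) are more elaborate.

```python
from collections import Counter
from typing import Dict, List, Tuple


def split_order(total_size: int, n_chunks: int) -> List[int]:
    """Split a large order into n_chunks near-equal child orders."""
    if total_size < 0 or n_chunks < 1:
        raise ValueError("need total_size >= 0 and n_chunks >= 1")
    base, remainder = divmod(total_size, n_chunks)
    # The first `remainder` chunks absorb the leftover units.
    return [base + 1 if i < remainder else base for i in range(n_chunks)]


def aggregate_schedule(participants: List[Tuple[int, int, int]]) -> Dict[int, int]:
    """Total submitted size per day.

    Each participant is (start_day, total_size, n_chunks) and is assumed
    to submit one chunk per day starting on start_day.
    """
    volume: Counter = Counter()
    for start_day, total_size, n_chunks in participants:
        for offset, chunk in enumerate(split_order(total_size, n_chunks)):
            volume[start_day + offset] += chunk
    return dict(sorted(volume.items()))


if __name__ == "__main__":
    # Two participants reacting to the same news with different urgency.
    print(aggregate_schedule([(0, 300, 3), (2, 100, 2)]))
```

A single news event on day 0 thus produces order flow on several later days, which is the qualitative effect that the conjecture relies upon.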

Price impact and market impact also continue to be 
active areas of research. Indeed, a deeper understand- 
ing of these notions is very desirable, as they form a 
conceptual bridge between the microstructure mechan- 
ics of order matchings and the macroeconomic concepts 
of price formation. Considerations about price impact 
and market impact could also help to explain the actions 
of market participants in certain situations. For exam-
ple, Wyart et al. (2008) conjectured that the empirically
observed cross-correlation between volatility and the rel-
ative price of incoming limit orders might be a result
of market participants carefully managing their market
impact. Gatheral (2010) has shown that if the instanta-
neous mid-price impact function is nonlinear in market
order size ω_x - and the empirical evidence certainly sug-
gests this is the case (Hasbrouck 1991; Lillo et al. 2003;
Potters and Bouchaud 2003) - then it is possible to de-
duce bounds on the way that the LOB must repopulate
if arbitrage opportunities are to be excluded. However, 
despite the striking regularities that have emerged from 
empirical studies, little is understood about why the func- 
tional forms of price impact functions are what they are, 
and almost nothing is understood about market impact. 
Additionally, L(t) clearly plays an important role in de-
termining the price and market impacts of an action, and
it has recently been observed (Cont et al. 2011) that
tracking only the mean impact of individual market or-
ders might be insufficient to gain clear insight into price
impact and market impact. To remedy this problem, it
has been proposed that price impact should be studied
not only as a function of arriving order size (or imbal-
ance) but also as a function of L(t) at the time of order
submissions (Cristelli et al. 2010). However, the curse of
dimensionality poses substantial problems, as the state
space of L(t) is so large.
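
A rough count shows how quickly the state space grows. The sketch below assumes, for illustration only, that the LOB is summarized by the depth at a fixed number of price levels and that each depth is coarsened into a small number of buckets. Both the discretization and the function names are hypothetical; the full state L(t) is far richer than this.

```python
def lob_state_count(n_price_levels: int, n_depth_buckets: int) -> int:
    """Number of distinct coarse-grained LOB states when each of
    n_price_levels can hold one of n_depth_buckets depth values."""
    if n_price_levels < 1 or n_depth_buckets < 1:
        raise ValueError("both arguments must be positive")
    return n_depth_buckets ** n_price_levels


def mean_observations_per_state(n_observations: int,
                                n_price_levels: int,
                                n_depth_buckets: int) -> float:
    """Average number of observations available per state if they
    were spread evenly over all states (an optimistic assumption)."""
    return n_observations / lob_state_count(n_price_levels, n_depth_buckets)


if __name__ == "__main__":
    print(lob_state_count(10, 5))
    print(mean_observations_per_state(1_000_000, 10, 5))
```

Even with only 10 price levels and 5 depth buckets there are 9,765,625 states, so a million market orders would provide about 0.1 observations per state on average, which is far too few to estimate an impact function conditional on each state.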

As ever more electronic limit order trading platforms 
have become available, it has become increasingly com- 
mon for specific assets to be traded on multiple electronic 
LOBs simultaneously. This poses a problem for empirical 
research, as the study of any individual LOB in isolation 
no longer provides a snapshot of the "whole" limit order 
market for the asset. Furthermore, differences in match- 
ing rules and transaction costs across different trading 
platforms make it difficult to directly compare different 
LOBs. Encouragingly, recent work has found similar be-
haviours when studying different LOBs that traded the
same asset simultaneously (Cont et al. 2011), but there
is no reason to assume that this will always be the case.
It is beneficial for market participants to have the option 
of trading the same asset on multiple platforms, as com- 
petition between different exchanges drives technological 
innovation and reduces the amount of market downtime. 
However, understanding how to assimilate data across
multiple platforms will be of paramount importance in
future studies. 
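
As a first step towards assimilating data across platforms, one might consolidate the best quotes from several LOBs for the same asset. The sketch below is a deliberately naive illustration: the venue labels and prices are invented, and it ignores precisely the complications noted above (different matching rules, transaction costs, tick sizes, and unsynchronized clocks).

```python
from typing import Dict, Optional, Tuple

# (best bid price, best ask price); None marks an empty side of the book.
Quote = Tuple[Optional[float], Optional[float]]


def consolidated_best_quotes(venues: Dict[str, Quote]) -> Quote:
    """Highest bid and lowest ask across all venues at one instant."""
    bids = [bid for bid, _ in venues.values() if bid is not None]
    asks = [ask for _, ask in venues.values() if ask is not None]
    return (max(bids) if bids else None, min(asks) if asks else None)


if __name__ == "__main__":
    # Hypothetical snapshot of three platforms trading the same asset.
    snapshot = {
        "venue_A": (99.98, 100.02),
        "venue_B": (99.99, 100.03),
        "venue_C": (None, 100.01),
    }
    print(consolidated_best_quotes(snapshot))
```

Note that the consolidated bid can exceed the consolidated ask when venues are compared in this way, so any statistic built on such a merged view needs careful interpretation.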

Finally, in addition to being a popular trade-matching 
algorithm that offers market participants greater choice 
and flexibility than ever before, LOBs are a rich and 
exciting testing ground for theories. Both empirical 
data and LOB models have provided new insight into 
longstanding economic questions such as market ef-
ficiency, price formation, and the rationality of market
participants. Furthermore, LOBs are a classic example 
of a complex system. Despite the deceptively simple 
rules governing trade, several hallmarks of complexity, 
including nonlinear feedback, scaling, and universality, 
are present in LOBs. Both the quantity and the quality
of LOB data that is available far exceed those of many
other studied complex systems. It will be interesting to 
see what new insights into not just trading, but complex 
systems as a whole, the study of LOBs is able to provide 
in the future. 



ACKNOWLEDGMENTS 

We would like to thank Bruno Biais, Jean-Philippe 
Bouchaud, J. Doyne Farmer, Gabriele La Spada, Sergei 
Maslov, Stephen Roberts, Torsten Schöneborn, Cosma
Shalizi, Neil Shephard, D. Eric Smith, Jonathan Tse, 
Thaleia Zariphopoulou, and Wei-Xing Zhou for useful 
discussions. MDG would like to thank EPSRC (In- 
dustrial CASE Award 08001834), HSBC Bank, and the 
Oxford-Man Institute of Quantitative Finance for sup- 
porting this work. 



REFERENCES 

Abergel, F., and A. Jedidi (2011), "A mathematical approach 
to order book modelling," in Econophysics of Order-driven 
Markets: Proceedings of Econophys-Kolkata V, edited by 
F. Abergel, B. K. Chakrabarti, C. A., and M. M. (Springer, 
Milan) pp. 93-107. 

Aït-Sahalia, Y., P. Mykland, and L. Zhang (2005),
"Ultra high frequency volatility estimation with de-
pendent microstructure noise," University of Chicago
Preprint, available at galton.uchicago.edu/~mykland/paperlinks/depnoise.pdf

Alfonsi, A., A. Fruth, and A. Schied (2010), "Optimal ex- 
ecution strategies in limit order books with general shape 
functions," Quantitative Finance 10 (2), 143. 

Almgren, R., and N. Chriss (2001), "Optimal execution of 
portfolio transactions," Journal of Risk 3 (2), 5. 

Anand, A., S. Chakravarty, and T. Martell (2005), "Empiri- 
cal evidence on the evolution of liquidity: choice of market 
versus limit orders by informed and uninformed traders," 
Journal of Financial Markets 8 (3), 288. 

Andersen, T. G., and V. Todorov (2010), "Realized volatility
and multipower variation," in Encyclopedia of Quantitative 
Finance, edited by R. Cont (Wiley) pp. 1494-1500. 

Bak, P., M. Paczuski, and M. Shubik (1997), "Price variations
in a stock market with many agents," Physica A 246 (3-4),
430.

Bandi, F., and J. Russell (2006), "Separating microstruc- 
ture noise from volatility," Journal of Financial Economics 
79 (3), 655. 

Barndorff-Nielsen, O. E., and N. Shephard (2010), "Volatil- 
ity," in Encyclopedia of Quantitative Finance, edited by 
R. Cont (Wiley) pp. 1898-1901. 

Bauwens, L., and N. Hautsch (2009), "Modelling financial 
high frequency data using point processes," in Handbook of 
Financial Time Series, edited by T. G. Andersen, R. A. 
Davis, J. P. Kreiss, and T. Mikosch (Springer, Berlin) pp. 
953-979. 

Beran, J. (1994), Statistics for long-memory processes. Vol. 61 
(Chapman & Hall). 

Bertsimas, D., and A. Lo (1998), "Optimal control of execu- 
tion costs," Journal of Financial Markets 1 (1), 1. 

Biais, B., P. Hillion, and C. Spatt (1995), "An empirical
analysis of the limit order book and the order flow in the 
Paris Bourse," The Journal of Finance 50 (5), 1655. 

Biais, B., P. Hillion, and C. Spatt (1999), "Price discov- 
ery and learning during the preopening period in the Paris 
Bourse," Journal of Political Economy 107 (6), 1218. 

Boehmer, E., G. Saar, and L. Yu (2005), "Lifting the veil: 
an analysis of pre-trade transparency at the NYSE," The 
Journal of Finance 60 (2), 783. 

Bortoli, L., A. Frino, E. Jarnecic, and D. Johnstone (2006), 
"Limit order book transparency, execution risk, and market 
liquidity: evidence from the Sydney Futures Exchange," 
Journal of Futures Markets 26 (12), 1147. 

Bouchaud, J. P., J. D. Farmer, and F. Lillo (2009), "How 
markets slowly digest changes in supply and demand," 
(North-Holland, San Diego) pp. 57-160. 

Bouchaud, J. P., Y. Gefen, M. Potters, and M. Wyart (2004), 
"Fluctuations and response in financial markets: the subtle 
nature of 'random' price changes," Quantitative Finance 
4 (2), 176. 

Bouchaud, J. P., M. Mezard, and M. Potters (2002), "Statis- 
tical properties of stock order books: empirical results and 
models," Quantitative Finance 2 (4), 251. 

Bouchaud, J. P., and M. Potters (2003), Theory of financial 
risk and derivative pricing: from statistical physics to risk 
management (Cambridge University Press). 

Cao, C., O. Hansch, and X. Wang (2008), "Order placement
strategies in a pure limit order book market," Journal of 
Financial Research 31 (2), 113. 

Carrie, C. (2006), "The new electronic trading regime of dark 
books, mashups and algorithmic trading," in Algorithmic 
Trading II: Precision, Control, Execution, edited by B. R. 
Bruce (Institutional Investor Journals) pp. 14-20. 

Chakraborti, A., I. M. Toke, M. Patriarca, and F. Abergel 
(2011a), "Econophysics: agent-based models," Quantita- 
tive Finance 11 (7), 1013. 

Chakraborti, A., I. M. Toke, M. Patriarca, and F. Abergel 
(2011b), "Econophysics: empirical facts," Quantitative Fi- 
nance 11 (7), 991. 

Chakravarty, S., and C. W. Holden (1995), "An integrated 
model of market and limit orders," Journal of Financial 
Intermediation 4 (3), 213. 

Challet, D., and R. Stinchcombe (2001), "Analyzing and 
modeling 1+1d markets," Physica A 300 (1-2), 285.

Challet, D., and R. Stinchcombe (2003), "Non-constant rates 
and over-diffusive prices in a simple model of limit order 
markets," Quantitative Finance 3 (3), 155. 



Chan, D. L. C., D. Eliezer, and I. I. Kogan (2001), "Numer-
ical analysis of the minimal and two-liquid models of the
market microstructure," arXiv:0101474. 

Chiarella, C., and G. Iori (2002), "A simulation analysis of
the microstructure of double auction markets," Quantita- 
tive Finance 2 (5), 346. 

Clauset, A., C. R. Shalizi, and M. E. J. Newman (2009), 
"Power-law distributions in empirical data," SIAM Review 
51 (4), 661. 

Cont, R. (2001), "Empirical properties of asset returns: styl- 
ized facts and statistical issues," Quantitative Finance 
1 (2), 223. 

Cont, R. (2005), "Long range dependence in financial mar- 
kets," in Fractals in Engineering, edited by J. Levy-Vehel 
and E. Lutton (Springer, London). 

Cont, R., and J. Bouchaud (2000), "Herd behavior and ag- 
gregate fluctuations in financial markets," Macroeconomic 
Dynamics 4 (2), 170. 

Cont, R., A. Kukanov, and S. Stoikov (2011), "The price 
impact of order book events," arXiv:1011.6402. 

Cont, R., and A. de Larrard (2011), "Price dynamics in a 
Markovian limit order market," arXiv:1104.4596. 

Cont, R., M. Potters, and J. P. Bouchaud (1997), "Scaling
in stock market data: stable laws and beyond," in Scale
invariance and beyond: Les Houches Workshop, March 10-
14, 1997, edited by B. Dubrulle, F. Graner, and D. Sor-
nette (Springer).

Cont, R., S. Stoikov, and R. Talreja (2010), "A stochas- 
tic model for order book dynamics," Operations Research 
58 (3), 549. 

Copeland, T. E., and D. Galai (1983), "Information effects 
on the bid-ask spread," Journal of Finance 38 (5), 1457. 

Cristelli, M., V. Alfi, L. Pietronero, and A. Zaccaria (2010),
"Liquidity crisis, granularity of the order book and price 
fluctuations," The European Physical Journal B 73 (1), 
41. 

Daniels, M. G., J. D. Farmer, L. Gillemot, G. Iori, and
E. Smith (2002), "A quantitative model of trading and price
formation in financial markets," arXiv:0112422.

Drożdż, S., M. Forczek, J. Kwapień, P. Oświęcimka, and
R. Rak (2007), "Stock market return distributions: From
past to present," Physica A 383 (1), 59.

Dufour, A., and R. F. Engle (2000), "Time and the price
impact of a trade," The Journal of Finance 55 (6), 2467.

Eisler, Z., J. P. Bouchaud, and J. Kockelkoren (2010), "The
price impact of order book events: market orders, limit
orders and cancellations," arXiv:0904.0900.

Eliezer, D., and I. I. Kogan (1998), "Scaling laws for the
market microstructure of the interdealer broker markets,"
arXiv:9808240.

Ellul, A., C. W. Holden, P. Jain, and R. Jennings
(2003), "Determinants of order choice on the New
York Stock Exchange," Indiana University Preprint,
available at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.114.6359

Engle, R. F., and A. J. Patton (2004), "Impacts of trades 
in an error-correction model of quote prices," Journal of 
Financial Markets 7 (1), 1. 

Evans, M. D. D., and R. K. Lyons (2002), "Order flow 
and exchange rate dynamics," Journal of Political Econ- 
omy 110 (1), 170. 

Farmer, J. D., and D. Foley (2009), "The economy needs 
agent-based modelling," Nature 460 (7256), 685.

Farmer, J. D., and F. Lillo (2004), "On the origin of power-
law tails in price fluctuations," Quantitative Finance 4 (1),
7.

Farmer, J. D., P. Patelli, and I. I. Zovko (2005), "The predic- 
tive power of zero intelligence in financial markets," Pro- 
ceedings of the National Academy of Sciences of the United 
States of America 102 (6), 2254. 

Field, J., and J. Large (2008), "Pro-rata matching and 
one-tick futures markets," University of Oxford Preprint, 
available at http://www.economics.ox.ac.uk/members/jeremy.large/ProRataApril08.pdf

Foucault, T. (1999), "Order flow composition and trading 
costs in a dynamic limit order market," Journal of Finan- 
cial Markets 2 (2), 99. 

Foucault, T., O. Kadan, and E. Kandel (2005), "Limit order 
book as a market for liquidity," Review of Financial Studies 
18 (4), 1171. 

Friedman, D. (2005), "The double auction market institu- 
tion: a survey," in The Double Auction Market: Institu- 
tions, Theories, and Evidence, edited by D. Friedman and 
J. Rust (Addison-Wesley).

Gabaix, X., P. Gopikrishnan, V. Plerou, and H. E. Stanley 
(2006), "Institutional investors and stock market volatil- 
ity," Quarterly Journal of Economics 121 (2), 461. 

Gallant, A. R., P. E. Rossi, and G. Tauchen (1992), "Stock 
prices and volume," Review of Financial Studies 5 (2), 199.

Gatheral, J. (2010), "No-dynamic-arbitrage and market im- 
pact," Quantitative Finance 10 (7), 749. 

Geweke, J., and S. Porter-Hudak (1983), "The estimation and 
application of long-memory time-series models," Journal of 
Time Series Analysis 4 (4), 221. 

Glosten, L. R. (1994), "Is the electronic open limit order book 
inevitable?" The Journal of Finance 49 (4), 1127. 

Glosten, L. R., and P. R. Milgrom (1985), "Bid, ask and 
transaction prices in a specialist market with heteroge- 
neously informed traders," Journal of Financial Economics 
14 (1), 71. 

Gode, D. K., and S. Sunder (1993), "Allocative efficiency of
markets with zero-intelligence traders: market as a partial 
substitute for individual rationality," Journal of Political 
Economy 101 (1), 119. 

Goettler, R., C. Parlour, and U. Rajan (2006), "Microstruc-
ture effects and asset pricing," Preprint, available at
http://en.scientificcommons.org/33345856

Gopikrishnan, P., M. Meyer, L. A. N. Amaral, and H. E. 
Stanley (1998), "Inverse cubic law for the distribution of 
stock price variations," The European Physical Journal B- 
Condensed Matter and Complex Systems 3 (2), 139. 

Gopikrishnan, P., V. Plerou, L. A. N. Amaral, M. Meyer, 
and H. E. Stanley (1999), "Scaling of the distribution of 
fluctuations of financial market indices," Physical Review 
E 60 (5), 5305. 

Gopikrishnan, P., V. Plerou, X. Gabaix, and H. E. Stanley 
(2000), "Statistical properties of share volume traded in 
financial markets," Physical Review E 62 (4), 4493. 

Gould, M. D., M. A. Porter, S. Williams, M. McDonald, D. J.
Fenn, and S. D. Howison (2011), "Statistical properties of 
foreign exchange limit order books," Working paper. 

Grossman, S. J., and J. E. Stiglitz (1980), "On the impossi- 
bility of informationally efficient markets," The American 
Economic Review 70 (3), 393. 

Gu, G. F., W. Chen, and W. X. Zhou (2008a), "Empirical dis- 
tributions of Chinese stock returns at different microscopic 
timescales," Physica A 387, 495. 

Gu, G. F., W. Chen, and W. X. Zhou (2008b), "Empirical
regularities of order placement in the Chinese stock mar-
ket," Physica A 387 (13), 3173.

Gu, G. F., W. Chen, and W. X. Zhou (2008c), "Empirical
shape function of limit-order books in the Chinese stock
market," Physica A 387 (21), 5182.

Gu, G. F., and W. X. Zhou (2009a), "Emergence of long
memory in stock volatility from a modified Mike-Farmer
model," Europhysics Letters 86, 48002.

Gu, G. F., and W. X. Zhou (2009b), "On the probability
distribution of stock returns in the Mike-Farmer model,"
European Physical Journal B 67 (4), 585.

Guillaume, D. M., M. M. Dacorogna, R. R. Davé, U. A.
Müller, R. B. Olsen, and O. V. Pictet (1997), "From the
bird's eye to the microscope: a survey of new stylized facts
of the intra-daily foreign exchange markets," Finance and
Stochastics 1 (2), 95.

Hall, A. D., and N. Hautsch (2006), "Order aggressiveness
and order book dynamics," Empirical Economics 30 (4),
973.

Hamilton, J. D. (1994), Time series analysis, Vol. 2 (Cam- 
bridge University Press). 

Harris, L. (2003), Trading and exchanges: Market microstruc- 
ture for practitioners (Oxford University Press). 

Harris, L., and J. Hasbrouck (1996), "Market vs. limit orders: 
the SuperDOT evidence on order submission strategy," 
Journal of Financial and Quantitative Analysis 31 (2), 213. 

Hasbrouck, J. (1991), "Measuring the information content of 
stock trades," Journal of Finance 46 (1), 179. 

Hasbrouck, J., and G. Saar (2002), "Limit orders and volatil- 
ity in a hybrid market: The Island ECN," NYU Working 
Paper No. FIN-01-025, available at
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1294561

Hautsch, N., and R. Huang (2009), "The market impact 
of a limit order," Humboldt Universität Preprint, avail-
able at http://sfb649.wiwi.hu-berlin.de/papers/pdf/SFB649DP2009-051.pdf

Hendershott, T., and C. M. Jones (2005), "Island goes dark: 
transparency, fragmentation, and regulation," Review of 
Financial Studies 18 (3), 743. 

Hendershott, T., C. M. Jones, and A. J. Menkveld (2011), 
"Does algorithmic trading improve liquidity?" The Journal 
of Finance 66 (1), 1. 

Hill, B. M. (1975), "A simple general approach to inference 
about the tail of a distribution," The Annals of Statistics
3 (5), 1163.

Hollifield, B., R. A. Miller, and P. Sandas (2004), "Empirical
analysis of limit order markets," The Review of Economic 
Studies 71 (4), 1027. 

Hollifield, B., R. A. Miller, P. Sandas, and J. Slive (2006),
"Estimating the gains from trade in limit-order markets," 
The Journal of Finance 61 (6), 2753. 

Jones, C. M., G. Kaul, and M. L. Lipson (1994), "Transac- 
tions, volume, and volatility," Review of Financial Studies 
7 (4), 631. 

Kantelhardt, J. W., E. Koscielny-Bunde, H. H. A. Rego,
S. Havlin, and A. Bunde (2001), "Detecting long-range
correlations with detrended fluctuation analysis," Physica
A 295 (3), 441. 

Kempf, A., and O. Korn (1999), "Market depth and order 
size," Journal of Financial Markets 2 (1), 29. 

Kyle, A. S. (1985), "Continuous auctions and insider trading," 
Econometrica 53 (6), 1315. 

La Spada, G., and F. Lillo (2011), "The effect of round-off 
error on long memory processes," arXiv:1107.4476.

Lillo, F. (2007), "Limit order placement as an utility maxi-
mization problem and the origin of power law distribution
of limit order prices," European Physical Journal B 55 (4), 
453. 

Lillo, F., and J. D. Farmer (2004), "The long memory of
the efficient market," Studies in Nonlinear Dynamics and
Econometrics 8 (3), 1.

Lillo, F., J. D. Farmer, and R. N. Mantegna (2003), "Econo-
physics: master curve for price-impact function," Nature
421 (6919), 129.

Lillo, F., S. Mike, and J. D. Farmer (2005), "Theory for long
memory in supply and demand," Physical Review E 71 (6),
066122.

Liu, Y., P. Cizeau, M. Meyer, C. K. Peng, and H. Eu- 
gene Stanley (1997), "Correlations in economic time se- 
ries," Physica A 245 (3-4), 437. 

Liu, Y., P. Gopikrishnan, P. Cizeau, M. Meyer, C. K. Peng, 
and H. E. Stanley (1999), "Statistical properties of the 
volatility of price fluctuations," Physical Review E 60 (2), 
1390. 

Lo, A. W. (1989), Long-term memory in stock market prices, 
Tech. Rep. (National Bureau of Economic Research). 

Lo, A. W., and A. C. MacKinlay (2001), A non-random walk 
down Wall Street (Princeton University Press, Princeton, 
NJ). 

Lo, I., and S. G. Sapp (2010), "Order aggressiveness and 
quantity; how are they determined in a limit order mar- 
ket?" Journal of International Financial Markets, Institu- 
tions and Money 20 (3), 213. 

Luckock, H. (2001), "A statistical model of a limit order mar-
ket," Sydney University Preprint, available at
http://www.maths.usyd.edu.au/res/AppMaths/Luc/2001-9.pdf

Luckock, H. (2003), "A steady-state model of the continuous
double auction," Quantitative Finance 3 (5), 385. 

Madhavan, A., D. Porter, and D. Weaver (2005), "Should 
securities markets be transparent?" Journal of Financial 
Markets 8 (3), 265. 

Maskawa, J. (2007), "Correlation of coming limit price with 
order book in stock markets," Physica A 383 (1), 90. 

Maslov, S. (2000), "Simple model of a limit order-driven mar- 
ket," Physica A 278 (3-4), 571. 

Maslov, S., and M. Mills (2001), "Price fluctuations from 
the order book perspective - Empirical facts and a simple 
model," Physica A 299 (1-2), 234. 

Mike, S., and J. D. Farmer (2008), "An empirical behavioral 
model of liquidity and volatility," Journal of Economic Dy- 
namics and Control 32 (1), 200. 

Mitchell, M. (2009), Complexity: A guided tour (Oxford Uni-
versity Press).

Mittal, H. (2008), "Are you playing in a toxic dark pool?" 
The Journal of Trading 3 (3), 20. 

Mizrach, B. (2008), "The next tick on NASDAQ," Quantita- 
tive Finance 8 (1), 19. 

Mu, G. H., W. Chen, J. Kertesz, and W. X. Zhou (2009), 
"Preferred numbers and the distributions of trade sizes and 
trading volumes in the Chinese stock market," The Euro- 
pean Physical Journal B 68 (1), 145. 

Mu, G. H., and W. X. Zhou (2010), "Tests of nonuniversality 
of the stock return distributions in an emerging market," 
Physical Review E 82 (6), 066103. 

NASDAQ, (2010), Retrieved 20th September, 2011, from
www.nasdaqtrader.com/content/productsservices/trading/psx/psxfacts.pdf

Obizhaeva, A., and J. Wang (2005), "Optimal trading
strategy and supply/demand dynamics," Working Paper,
SSRN eLibrary, available at
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=752022

Parlour, C., and D. J. Seppi (2008), "Limit order markets:
a survey," in Handbook of Financial Intermediation and 
Banking, edited by A. Thakor and A. Boot (Elsevier). 

Parlour, C. A. (1998), "Price dynamics in limit order mar- 
kets," Review of Financial Studies 11 (4), 789. 

Peng, C. K., S. V. Buldyrev, S. Havlin, M. Simons, H. E. 
Stanley, and A. L. Goldberger (1994), "Mosaic organiza- 
tion of DNA nucleotides," Physical Review E 49 (2), 1685. 

Plerou, V., P. Gopikrishnan, X. Gabaix, and H. E. Stanley
(2002), "Quantifying stock-price response to demand fluc- 
tuations," Physical Review E 66 (2), 27104. 

Plerou, V., and H. E. Stanley (2008), "Stock return distribu-
tions: Tests of scaling and universality from three distinct
stock markets," Physical Review E 77 (3), 037101. 

Potters, M., and J. P. Bouchaud (2003), "More statistical 
properties of order books and price impact," Physica A 
324, 133. 

Preis, T., S. Golke, W. Paul, and J. J. Schneider (2006), 
"Multi-agent-based order book model of financial markets," 
Europhysics Letters 75, 510. 

Preis, T., S. Golke, W. Paul, and J. J. Schneider (2007), "Sta- 
tistical analysis of financial returns for a multiagent order 
book model of asset trading," Physical Review E 76 (1), 
016108. 

Ranaldo, A. (2004), "Order aggressiveness in limit order book 
markets," Journal of Financial Markets 7 (1), 53. 

Rea, W., L. Oxley, M. Reale, and J. Brown (2009), "Es- 
timators for long-range dependence: an empirical study," 
arXiv:0901.0762. 

Robinson, P. M., Ed. (2003), Time series with long memory 
(Oxford University Press). 

Roşu, I. (2009), "A dynamic model of the limit order book,"
Review of Financial Studies 22 (11), 4601. 

Roşu, I. (2010), "Liquidity and information in or-
der driven markets," Working paper, SSRN eLibrary,
available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1286193

Sandas, P. (2001), "Adverse selection and competitive market 
making: empirical evidence from a limit order market," 
Review of Financial Studies 14 (3), 705. 

Seppi, D. J. (1997), "Liquidity provision with limit orders and 
a strategic specialist," Review of Financial Studies 10 (1), 
103. 

SETS, (2011), Retrieved 20th September, 2011, from
londonstockexchange.com/products-and-services/trading-services/sets/sets.htm

Shephard, N., Ed. (2005), Stochastic Volatility (Oxford Uni- 
versity Press). 

Slanina, F. (2001), "Mean-field approximation for a limit 
order driven market model," Physical Review E 64 (5), 
56136. 

Smith, E., J. D. Farmer, L. Gillemot, and S. Krishnamurthy 
(2003), "Statistical theory of the continuous double auc- 
tion," Quantitative Finance 3 (6), 481. 

Stanley, H. E., V. Plerou, and X. Gabaix (2008), "A sta-
tistical physics view of financial fluctuations: evidence for
scaling and universality," Physica A 387 (15), 3967. 

Stumpf, M. P. H., and M. A. Porter (2012), "Critical truths 
about power laws," Science 335 (6069), 665. 

Tang, L. H., and G. S. Tian (1999), "Reaction-diffusion-
branching models of stock price fluctuations," Physica A
264 (3-4), 543.

Taqqu, M. S., V. Teverovsky, and W. Willinger (1995), "Es- 
timators for long-range dependence: an empirical study," 
Fractals 3 (4), 785. 

Taylor, S. J. (2008), Modelling financial time series (World 
Scientific Publishing). 

Teverovsky, V., M. S. Taqqu, and W. Willinger (1999), "A 
critical look at Lo's modified R/S statistic," Journal of Sta-
tistical Planning and Inference 80 (1-2), 211.

Thomson-Reuters, (2011), Retrieved 20th September, 2011,
from https://dxtrapub.markets.reuters.com/docs/Matching_Rule_Book.pdf

Toke, I. M. (2011), "'Market making' in an order book
model and its impact on the spread," in Econophysics of 
Order-driven Markets: Proceedings of Econophys-Kolkata 
V, edited by F. Abergel, B. K. Chakrabarti, C. A., and 
M. M. (Springer, Milan) pp. 49-64. 

Wyart, M., J. P. Bouchaud, J. Kockelkoren, M. Potters, and 
M. Vettorazzo (2008), "Relation between bid-ask spread, 
impact and volatility in order-driven markets," Quantita-
tive Finance 8 (1), 41.

Xu, L., P. C. Ivanov, K. Hu, Z. Chen, A. Carbone, and 
H. E. Stanley (2005), "Quantifying signals with power-law 
correlations: a comparative study of detrended fluctuation 
analysis and detrended moving average techniques," Phys- 
ical Review E 71 (5), 051101. 

Zhao, L. (2010), A model of limit order book dynamics and 
a consistent estimation procedure, Ph.D. thesis (Carnegie 
Mellon University). 

Zhou, B. (1996), "High-frequency data and volatility in 
foreign-exchange rates," Journal of Business & Economic 
Statistics 14 (1), 45. 

Zhou, W. X. (2012), "Universal price impact functions of indi- 
vidual trades in an order-driven market," arXiv:0708.3198. 

Zovko, I., and J. D. Farmer (2002), "The power of patience: 
a behavioral regularity in limit order placement," Quanti- 
tative Finance 2 (5), 387. 

Zumbach, G. (2004), "How trading activity scales with com- 
pany size in the FTSE 100," Quantitative Finance 4 (4), 
441. 






Appendix A: Table of Empirical Studies 



Reference 


Assets Studied 


Date Range 


Data Type 


Main Points Studied 


Aït-Sahalia et al. (2005)

The 30 Dow Jones Industrial Average stocks

19th-23rd and 26th-30th April, 2004

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), and all market orders

Volatility and long range dependence in order flows


Anand et al. (2005)

144 stocks traded on the NYSE

November 1990 to January 1991

All order flows at all prices

Decision between using limit orders or market orders for
informed traders


Bandi and Russell (2006)

All stocks in the S&P 100 index

February 2002

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), and all market orders

Volatility


Biais et al. (1995)

The CAC 40, traded on the Paris Bourse

6 trading days in June/July 1991 and 19 trading days in
October/November 1991

N^b(p,t) and N^a(p,t), for p = 0,1,2,3,4, updated every time
one of them changes

Returns, percentage of market orders that match against hidden
liquidity, N^b(p), N^a(p), and s(t) (both unconditionally and
dependent on time of day), order flow (both unconditionally and
dependent on recent order flow and time of day) and state of L(t)




Biais et al. (1999)

The CAC 40, traded on the Paris Bourse

19 trading days in October/November 1991, 26 trading days in
1993, and 234 trading days in 1995

Once-per-minute sampling of b(t) and a(t)

Whether the evolution of the price process indicated learning on
behalf of the market participants during the daily opening auction


Boehmer et al. (2005)

400 stocks traded on the NYSE

January 7th-18th, February 4th-15th, March 4th-15th,
April 1st-12th, May 6th-17th, all in 2002

All order flows at all prices in the electronic LOB, plus
information about the handling of both electronic and manual
(broker-handled) orders

How the introduction of an electronic LOB on the NYSE affected
market participants' behaviour


Bortoli et al. (2006)

The 4 most actively traded futures contracts on the Sydney
Futures Exchange

September 15th 2000 to June 19th 2001

Every matching, change in b(t) or a(t), and change in depth
available at the best prices (respectively, best three prices)
prior to (respectively, after) the change in disseminated market
information, timestamped to the nearest second

Whether order flow and the state of the LOB changed when the
Sydney Futures Exchange increased the real-time information
disseminated to market participants, from only the depths
available at b(t) and a(t) to the depths available at the best
three prices on each side of the LOB


Bouchaud et al. (2002)

France Telecom, Vivendi, and Total stocks traded on the Paris
Bourse

February 2001

All order arrivals at all prices along with their time of
arrival, and a list of all orders that were cancelled (but not
the time at which they were cancelled)

N^b(p), N^a(p), and the distribution of relative price, order
size, n^b(b(t),t), and n^a(a(t),t)


Bouchaud and Potters (2003)

France Telecom stock, traded on the Paris Bourse (but similar
results reported for other unnamed liquid French and British
stocks)

Trading days during 2001 and 2002

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), recorded once every
time any of these changes and timestamped to the nearest second,
and all market orders, timestamped to the nearest second

How order flow affected prices








Cao et al. (2008)

100 largest stocks traded on the Australian Stock Exchange

March 2000

All order arrivals and cancellations at all prices, timestamped
to the nearest 0.01 seconds

How the state of the LOB affected order flow




Chakraborti et al. (2011b)

Four stocks traded on the Paris Bourse

All trading days between 1st October 2007 and 30th May 2008

All market orders, plus five highest priority active orders on
each side of the LOB

Whether the traditional stylized facts were present in the data


Challet and Stinchcombe (2001)

Four stocks traded on the Island ECN (on NASDAQ)

Not specified

15 highest priority active orders on each side of the LOB,
updated every time the list changed

Order flow rates, autocorrelation of order flow rates, diffusion
of active orders (i.e., cancellation of an active order
immediately followed by resubmission at a neighbouring price),
instantaneous price impact; distribution of order size, lifetime
of limit orders, relative price for incoming orders


Cont et al. (2010)

Sky Perfect Communications stock, traded on the Tokyo Stock
Exchange

Not specified

N^b(p,t) and N^a(p,t) for the five smallest relative prices with
nonzero depth available, updated whenever either one changed,
and all market orders

Arrival rates of market orders and arrival and cancellation
rates of limit orders




Cont et al. (2011)

50 stocks chosen at random from the S&P 500, traded on the NYSE

All 21 trading days in April 2010

n^b(b(t),t) and n^a(a(t),t), updated whenever either one
changed, with a timestamp rounded to the nearest second, and all
market orders

Relationship between order flow imbalance and price impact




Dufour and Engle (2000)

18 of the most frequently traded stocks on the NYSE

62 trading days between 1st November 1990 and 31st January 1991

b(t) and a(t), updated every time they change, and all market
orders

Relationship between market order inter-arrival times and price
impact


Eisler et al. (2010)

14 randomly selected stocks traded on NASDAQ

The 53 trading days between 3rd March 2008 and 19th May 2008

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), updated every time any
of them change

The price impact of market order submissions and limit order
submissions and cancellations




Ellul et al. (2003)

The 50 most actively traded stocks and 98 randomly chosen stocks
on the NYSE

The 5 trading days between 30th April 2001 and 5th May 2001

All market order submissions and all limit order submissions and
cancellations, timestamped to the nearest second

What factors market participants used when choosing the price of
their orders




Engle and Patton (2004)

100 randomly selected stocks traded on the NYSE

18 months of data, no date range specified

b(t) and a(t), updated every time they change, and all market
orders

s(t) and how price impact varied according to how frequently
trades occur for a specific stock


Farmer and Lillo (2004)

3 stocks traded on the LSE and 3 stocks traded on the NYSE

May 2000 to December 2002 for the LSE stocks and 1995-1996 for
the NYSE stocks

All order flows for the LSE; b(t) and a(t), updated every time
they change, and all market orders, for the NYSE

Price impact of individual market orders, and distribution of
order sizes for market orders






Farmer et al. (2005)

11 stocks traded on the LSE

All 434 trading days between 1st August 1998 and 30th April 2000

All market order submissions and all limit order submissions and
cancellations

Goodness of fit of the predictions regarding mean spread and
price diffusion of the Smith et al. (2003) model to data, and
mean instantaneous mid-price logarithmic return impact as a
function of market order size




Field and Large (2008)

Short Sterling, Euribor, Eurodollar, and 2-Year US Treasury Note
futures

23rd November to 11th December 2006 and 16th to 20th April 2007

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), updated every time any
of them change

Order flow rates and n^b(b(t),t) and n^a(a(t),t) in markets
where s(t) = Sp




Gode and Sunder (1993)

Laboratory experiment with human beings and computerized
zero-intelligence traders

N/A

All order flows at all prices

Relative applicability of perfect-rationality and
zero-intelligence assumptions, and emergence of seemingly
rational behaviour when aggregating across irrational individuals




Gopikrishnan et al. (2000)

1000 largest stocks traded in the US

1994-1995

a(t), b(t), and all market orders

Price impact as a function of trade imbalance count and trade
imbalance size, and distribution and autocorrelation of trade
imbalance count and trade imbalance size






Gu et al. (2008a)

Aggregation of 23 stocks traded on the Shenzhen Stock Exchange

The whole of 2003

All order flows at all prices

Distribution of mid-price returns on various τ second timescales
and various event-by-event timescales




Gu et al. (2008b)

Aggregation of 23 stocks traded on the Shenzhen Stock Exchange

The whole of 2003

All order flows at all prices

Distribution of relative prices of incoming orders, and whether
this is conditional on s(t) or volatility




Gu et al. (2008c)

23 stocks traded on the Shenzhen Stock Exchange

The whole of 2003

All order flows at all prices

N^b(p), N^a(p), and changes in relative depth profiles through
time




Gu and Zhou (2009a)

23 stocks traded on the Shenzhen Stock Exchange

The whole of 2003

All order flows at all prices

Autocorrelation of relative prices of incoming orders


Hall and Hautsch (2006)

The 5 most liquid stocks traded on the Australian Stock Exchange

July to August 2002

All order flows at all prices

Whether the distribution of relative prices of incoming orders
was conditional on the state of the LOB, volatility, and recent
order flows




Harris and Hasbrouck (1996)

144 randomly selected stocks traded on the NYSE

November 1990 to January 1991

All order flows at all prices

Analysis of performance measures aiding decision-making between
limit orders versus market orders




Hasbrouck and Saar (2002)

The 300 largest equities on NASDAQ, traded on Island ECN

1st October to 31st December 1999

All order flows at all prices

How volatility was related to order flow and the state of the
LOB, and how order fill probabilities and mean time to execution
varied with volatility






Hautsch and Huang (2009)

The 30 most frequently traded stocks on Euronext Amsterdam

All trading days between 1st August and 30th September, 2008

N^b(p,t) and N^a(p,t) for p = 0,1,2, updated whenever either one
changed, and a record of all trades that actually occurred,
timestamped to the nearest millisecond

Market impact of incoming limit orders




Hendershott and Jones (2005)

3 exchange traded funds on Island ECN

16th August to 31st October 2002

For activity on Island: for the first part of the data, b(t),
a(t), n^b(b(t),t) and n^a(a(t),t), updated every time any of
them change, and all market orders; for the second part of the
data, only market orders; for activity not on Island, b(t),
a(t), n^b(b(t),t) and n^a(a(t),t), updated every time any of
them change, and all market orders, for the entire data period

The effect that showing L(t) to market participants had on price
series


Hendershott et al. (2011)

943 stocks traded on the NYSE

February 2001 to December 2005

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), updated every time any
of them change

The effects of algorithmic trading on L(t)




Hollifield et al. (2004)

The Ericsson stock, traded on the Stockholm Stock Exchange

The 59 trading days between 3rd December 1991 and 2nd March 1992

All order flows at all prices

Whether market participants' actions could be explained by a
cut-off strategy based on their private valuation of the asset




Hollifield et al. (2006)

3 stocks traded on the Vancouver Stock Exchange

May 1990 to November 1993

All order flows at all prices

Distribution of traders' personal valuations, inferred from
their actions




Kempf and Korn (1999)

DAX futures contracts, traded on the German Futures and Options
Exchange

17th September 1993 to 15th September 1994

b(t), a(t) and all market orders

Permanent price impact, as a function of several measures of
trade imbalance, over 1 minute time horizons




Lillo and Farmer (2004)

20 stocks traded on the LSE

1999 to 2002

All order flows at all prices

Autocorrelation of order sizes, mid prices, n^b(b(t),t),
n^a(a(t),t), and order type (buy or sell) for arriving limit
orders, arriving market orders, and cancelled limit orders




Lillo et al. (2005)

20 stocks traded on the LSE

May 2000 to December 2002

All LOB order flows and all off-book trades for the same stocks

Effects of order splitting and hidden liquidity on observed
order flows




Lillo (2007)

AstraZeneca stock, traded on the LSE

May 2000 to December 2002

Order arrivals, partitioned by who submitted them

Distribution of relative prices for incoming limit orders from
specified market participants






Lo and Sapp (2010)

Deutsche Mark/US dollar and Canadian dollar/US dollar currency
pairs

5th October to 10th October 1997 for Deutsche Mark/US dollar;
1st May to 30th June 2005 for Canadian dollar/US dollar

All order flows at all prices

How market participants chose the size and relative prices of
their orders








Madhavan et al. (2005)

109 stocks traded via a LOB and 240 stocks traded by floor
traders on the Toronto Stock Exchange

March and May, 1990

For March, b(t), a(t), n^a(a(t),t), and n^b(b(t),t); for May,
b(t), a(t), N^b(p,t), and N^a(p,t), for p = 0,1,2,3,4, and all
market orders; all floor-trader trades for both months

How real-time disclosure of more information about the depth
profile affected market participants' behaviour


Maskawa (2007)

13 stocks traded on the LSE

July to December 2004

All order flows at all prices

Distribution of relative prices for incoming limit orders, and
whether this distribution was affected by the state of L(t)






Maslov and Mills (2001)

Cisco Systems, Broadcom Corporation, and JDS Uniphase
Corporation stocks traded on NASDAQ

30th June 2000 for Cisco Systems stock; 3rd July for Broadcom
Corporation stock; and 5th, 6th, and 11th July for JDS Uniphase
Corporation stock

b(t), a(t), N^b(p,t) and N^a(p,t) for p = 0,1,2,3, and all
market orders

Distribution of order sizes, n^b(b(t),t), n^a(a(t),t), depth
profiles, instantaneous price impact


Mike and Farmer (2008)

25 stocks traded on the LSE

May 2000 to December 2002

All order flows at all prices

Relative prices of incoming orders, autocorrelation of order
type in order flows, order cancellations


Mizrach (2008)

The 4 largest stocks on NASDAQ; 95 of the "NASDAQ 100" stocks;
and 87 other randomly chosen smaller NASDAQ stocks

December 2002

All order flows at all prices

How L(t) affected the next change in b(t) or a(t)




Mu et al. (2009)

22 stocks traded on the Shenzhen Stock Exchange

The whole of 2003

All order flows at all prices

Distribution of market order sizes






Mu and Zhou (2010)

978 stocks traded on the Shenzhen Stock Exchange

January 2004 to June 2006

b(t) and a(t), updated once every 6 to 8 seconds

Distribution of mid-price logarithmic returns for stocks in
emerging markets, and how this varied with time window and
market capitalization of the stock studied


Plerou et al. (2002)

The 116 most frequently traded US stocks

1994 to 1995

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), and all market orders

Price impact as a function of trade imbalance count and trade
imbalance size, over a variety of time horizons


Plerou and Stanley (2008)

1000 major US stocks; 85 of the FTSE 100 stocks (traded on the
LSE); 13 of the CAC 40 stocks (traded on the Paris Bourse); 422
stocks from the Center for Research in Security Prices (CRSP)

1994-1995 for US stocks; 2001-2002 for LSE stocks; 3rd Jan
1995-22nd Oct 1999 for Paris Bourse stocks; Jan 1962-Dec 1996
for CRSP database stocks

All market orders

Distribution of mid-price returns and number of arriving market
orders, and whether they varied according to market
capitalization or industry sector, on various τ second timescales








Potters and Bouchaud (2002)

Exchange traded funds that track NASDAQ and the S&P 500, and the
Microsoft stock

1st June to 15th July, 2002

All order flows at all prices

Distribution of relative prices, relative depth profiles,
arrival and cancellation rates, instantaneous price impact










Ranaldo (2004)

15 stocks traded on the Swiss Stock Exchange

March and April 1997

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), and all market orders

How volatility, recent order flow, and the state of L(t)
affected order flow, intraday patterns in spread and volatility,
symmetry between the buy and sell sides of the LOB




Sandås (2001)

10 stocks traded on the Stockholm Stock Exchange

59 trading days between 3rd December 1991 and 2nd March 1992

All order flows at all prices

Whether the depth profile supported hypotheses about how market
participants make decisions related to order submissions and
cancellations


Toke (2011)

3 stocks from the CAC 40, 3 month Euribor futures, and FTSE 100
futures

10th September 2009 to 30th September 2009

N^b(p,t) and N^a(p,t), for p = 0,1,2,3,4, updated whenever
either one changed, timestamped to the millisecond

Whether Hawkes processes provided a better explanation of order
flows than Poisson processes






Wyart et al. (2008)

The 68 most liquid stocks on the Paris Bourse, small tick index
futures contracts, and the 155 most actively traded stocks on
the NYSE

2002 for the Paris Bourse, 2005 for the small tick futures and
NYSE stocks

b(t), a(t), n^b(b(t),t) and n^a(a(t),t), and all market orders

How the profit of a market maker trading in a LOB depended on
s(t), and price impact




Zhao (2010)

Crude oil futures contracts, traded on the International
Petroleum Exchange

17th October 2005

N^b(p,t) and N^a(p,t), for p = 0,1,2,3,4, updated whenever
either one changed, and all market orders, timestamped to the
nearest second

Order flow rates






Zhou (1996)

Deutsche Mark/US dollar, US dollar/Yen and Deutsche Mark/Yen
currency pairs, traded on Reuters

1st October 1992 to 30th September 1993

b(t)

Volatility






Zhou (2012)

23 stocks traded on the Shenzhen Stock Exchange (although 1 is
later removed as its price was reported to be manipulated in the
data)

The whole of 2003

All order flows at all prices

Instantaneous price impact of individual orders


Zovko and Farmer (2002)

50 stocks traded on the LSE

1st August 1998 to 30th April 2000

Relative prices of incoming limit orders

Relative prices of incoming orders, autocorrelation of order
type in order flows, and volatility





